Hover Effect

Projet détection et reconnaissance - Segmentation sémantique d'images pour un cas d'usage de conduite autonome

Présenté par team R.O.M.Y- M2-IADS-EL - Juin 2024

Yann BEN ABDERRAHMANE

Khedoudja Rym MERAD

Ophélie ENGASSER

Mike DURAN

Importation des librairies principales

In [ ]:
!pip install tensorflow
Collecting tensorflow
  Using cached tensorflow-2.16.1-cp311-cp311-win_amd64.whl.metadata (3.5 kB)
Collecting tensorflow-intel==2.16.1 (from tensorflow)
  Using cached tensorflow_intel-2.16.1-cp311-cp311-win_amd64.whl.metadata (5.0 kB)
Collecting absl-py>=1.0.0 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Collecting astunparse>=1.6.0 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading astunparse-1.6.3-py2.py3-none-any.whl.metadata (4.4 kB)
Collecting flatbuffers>=23.5.26 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading flatbuffers-24.3.25-py2.py3-none-any.whl.metadata (850 bytes)
Collecting gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached gast-0.5.4-py3-none-any.whl.metadata (1.3 kB)
Collecting google-pasta>=0.1.1 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading google_pasta-0.2.0-py3-none-any.whl.metadata (814 bytes)
Collecting h5py>=3.10.0 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading h5py-3.11.0-cp311-cp311-win_amd64.whl.metadata (2.5 kB)
Collecting libclang>=13.0.0 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading libclang-18.1.1-py2.py3-none-win_amd64.whl.metadata (5.3 kB)
Collecting ml-dtypes~=0.3.1 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached ml_dtypes-0.3.2-cp311-cp311-win_amd64.whl.metadata (20 kB)
Collecting opt-einsum>=2.3.2 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached opt_einsum-3.3.0-py3-none-any.whl.metadata (6.5 kB)
Requirement already satisfied: packaging in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tensorflow-intel==2.16.1->tensorflow) (23.2)
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached protobuf-4.25.3-cp310-abi3-win_amd64.whl.metadata (541 bytes)
Requirement already satisfied: requests<3,>=2.21.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tensorflow-intel==2.16.1->tensorflow) (2.31.0)
Requirement already satisfied: setuptools in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tensorflow-intel==2.16.1->tensorflow) (65.5.0)
Requirement already satisfied: six>=1.12.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tensorflow-intel==2.16.1->tensorflow) (1.16.0)
Collecting termcolor>=1.1.0 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached termcolor-2.4.0-py3-none-any.whl.metadata (6.1 kB)
Requirement already satisfied: typing-extensions>=3.6.6 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tensorflow-intel==2.16.1->tensorflow) (4.10.0)
Collecting wrapt>=1.11.0 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading wrapt-1.16.0-cp311-cp311-win_amd64.whl.metadata (6.8 kB)
Collecting grpcio<2.0,>=1.24.3 (from tensorflow-intel==2.16.1->tensorflow)
  Downloading grpcio-1.64.0-cp311-cp311-win_amd64.whl.metadata (3.4 kB)
Collecting tensorboard<2.17,>=2.16 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached tensorboard-2.16.2-py3-none-any.whl.metadata (1.6 kB)
Collecting keras>=3.0.0 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached keras-3.3.3-py3-none-any.whl.metadata (5.7 kB)
Collecting tensorflow-io-gcs-filesystem>=0.23.1 (from tensorflow-intel==2.16.1->tensorflow)
  Using cached tensorflow_io_gcs_filesystem-0.31.0-cp311-cp311-win_amd64.whl.metadata (14 kB)
Requirement already satisfied: numpy<2.0.0,>=1.23.5 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tensorflow-intel==2.16.1->tensorflow) (1.26.4)
Collecting wheel<1.0,>=0.23.0 (from astunparse>=1.6.0->tensorflow-intel==2.16.1->tensorflow)
  Using cached wheel-0.43.0-py3-none-any.whl.metadata (2.2 kB)
Collecting rich (from keras>=3.0.0->tensorflow-intel==2.16.1->tensorflow)
  Using cached rich-13.7.1-py3-none-any.whl.metadata (18 kB)
Collecting namex (from keras>=3.0.0->tensorflow-intel==2.16.1->tensorflow)
  Using cached namex-0.0.8-py3-none-any.whl.metadata (246 bytes)
Collecting optree (from keras>=3.0.0->tensorflow-intel==2.16.1->tensorflow)
  Using cached optree-0.11.0-cp311-cp311-win_amd64.whl.metadata (46 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from requests<3,>=2.21.0->tensorflow-intel==2.16.1->tensorflow) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from requests<3,>=2.21.0->tensorflow-intel==2.16.1->tensorflow) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from requests<3,>=2.21.0->tensorflow-intel==2.16.1->tensorflow) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from requests<3,>=2.21.0->tensorflow-intel==2.16.1->tensorflow) (2024.2.2)
Collecting markdown>=2.6.8 (from tensorboard<2.17,>=2.16->tensorflow-intel==2.16.1->tensorflow)
  Downloading Markdown-3.6-py3-none-any.whl.metadata (7.0 kB)
Collecting tensorboard-data-server<0.8.0,>=0.7.0 (from tensorboard<2.17,>=2.16->tensorflow-intel==2.16.1->tensorflow)
  Using cached tensorboard_data_server-0.7.2-py3-none-any.whl.metadata (1.1 kB)
Collecting werkzeug>=1.0.1 (from tensorboard<2.17,>=2.16->tensorflow-intel==2.16.1->tensorflow)
  Downloading werkzeug-3.0.3-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: MarkupSafe>=2.1.1 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from werkzeug>=1.0.1->tensorboard<2.17,>=2.16->tensorflow-intel==2.16.1->tensorflow) (2.1.5)
Collecting markdown-it-py>=2.2.0 (from rich->keras>=3.0.0->tensorflow-intel==2.16.1->tensorflow)
  Using cached markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from rich->keras>=3.0.0->tensorflow-intel==2.16.1->tensorflow) (2.17.2)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->keras>=3.0.0->tensorflow-intel==2.16.1->tensorflow)
  Using cached mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Using cached tensorflow-2.16.1-cp311-cp311-win_amd64.whl (2.1 kB)
Using cached tensorflow_intel-2.16.1-cp311-cp311-win_amd64.whl (377.0 MB)
Using cached absl_py-2.1.0-py3-none-any.whl (133 kB)
Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Downloading flatbuffers-24.3.25-py2.py3-none-any.whl (26 kB)
Using cached gast-0.5.4-py3-none-any.whl (19 kB)
Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Downloading grpcio-1.64.0-cp311-cp311-win_amd64.whl (4.1 MB)
   ---------------------------------------- 0.0/4.1 MB ? eta -:--:--
    --------------------------------------- 0.1/4.1 MB 2.6 MB/s eta 0:00:02
   -- ------------------------------------- 0.3/4.1 MB 3.1 MB/s eta 0:00:02
   ---- ----------------------------------- 0.5/4.1 MB 3.5 MB/s eta 0:00:02
   ----- ---------------------------------- 0.5/4.1 MB 2.9 MB/s eta 0:00:02
   ------- -------------------------------- 0.7/4.1 MB 3.3 MB/s eta 0:00:02
   --------- ------------------------------ 0.9/4.1 MB 3.5 MB/s eta 0:00:01
   ----------- ---------------------------- 1.2/4.1 MB 3.8 MB/s eta 0:00:01
   -------------- ------------------------- 1.5/4.1 MB 4.0 MB/s eta 0:00:01
   ---------------- ----------------------- 1.7/4.1 MB 4.4 MB/s eta 0:00:01
   ------------------- -------------------- 1.9/4.1 MB 4.4 MB/s eta 0:00:01
   --------------------- ------------------ 2.2/4.1 MB 4.4 MB/s eta 0:00:01
   ------------------------ --------------- 2.5/4.1 MB 4.5 MB/s eta 0:00:01
   --------------------------- ------------ 2.8/4.1 MB 4.8 MB/s eta 0:00:01
   ------------------------------ --------- 3.2/4.1 MB 5.0 MB/s eta 0:00:01
   ---------------------------------- ----- 3.6/4.1 MB 5.3 MB/s eta 0:00:01
   -------------------------------------- - 4.0/4.1 MB 5.5 MB/s eta 0:00:01
   ---------------------------------------  4.1/4.1 MB 5.5 MB/s eta 0:00:01
   ---------------------------------------- 4.1/4.1 MB 5.1 MB/s eta 0:00:00
Downloading h5py-3.11.0-cp311-cp311-win_amd64.whl (3.0 MB)
   ---------------------------------------- 0.0/3.0 MB ? eta -:--:--
   --- ------------------------------------ 0.2/3.0 MB 7.3 MB/s eta 0:00:01
   ---- ----------------------------------- 0.4/3.0 MB 5.6 MB/s eta 0:00:01
   -------- ------------------------------- 0.6/3.0 MB 4.2 MB/s eta 0:00:01
   --------- ------------------------------ 0.7/3.0 MB 4.4 MB/s eta 0:00:01
   --------- ------------------------------ 0.7/3.0 MB 4.4 MB/s eta 0:00:01
   ---------- ----------------------------- 0.8/3.0 MB 2.7 MB/s eta 0:00:01
   ------------- -------------------------- 1.0/3.0 MB 3.1 MB/s eta 0:00:01
   ----------------- ---------------------- 1.3/3.0 MB 3.5 MB/s eta 0:00:01
   --------------------- ------------------ 1.6/3.0 MB 3.8 MB/s eta 0:00:01
   ------------------------ --------------- 1.8/3.0 MB 4.0 MB/s eta 0:00:01
   ----------------------------- ---------- 2.2/3.0 MB 4.2 MB/s eta 0:00:01
   --------------------------------- ------ 2.5/3.0 MB 4.6 MB/s eta 0:00:01
   ---------------------------------------  2.9/3.0 MB 4.9 MB/s eta 0:00:01
   ---------------------------------------- 3.0/3.0 MB 4.6 MB/s eta 0:00:00
Using cached keras-3.3.3-py3-none-any.whl (1.1 MB)
Downloading libclang-18.1.1-py2.py3-none-win_amd64.whl (26.4 MB)
   ---------------------------------------- 0.0/26.4 MB ? eta -:--:--
   ---------------------------------------- 0.2/26.4 MB 6.9 MB/s eta 0:00:04
   - -------------------------------------- 0.7/26.4 MB 8.4 MB/s eta 0:00:04
   - -------------------------------------- 1.0/26.4 MB 8.1 MB/s eta 0:00:04
   -- ------------------------------------- 1.3/26.4 MB 7.7 MB/s eta 0:00:04
   -- ------------------------------------- 1.9/26.4 MB 8.5 MB/s eta 0:00:03
   --- ------------------------------------ 2.4/26.4 MB 9.0 MB/s eta 0:00:03
   ---- ----------------------------------- 2.8/26.4 MB 9.0 MB/s eta 0:00:03
   ---- ----------------------------------- 3.3/26.4 MB 9.1 MB/s eta 0:00:03
   ----- ---------------------------------- 3.6/26.4 MB 9.2 MB/s eta 0:00:03
   ----- ---------------------------------- 3.6/26.4 MB 9.2 MB/s eta 0:00:03
   ------ --------------------------------- 4.2/26.4 MB 8.4 MB/s eta 0:00:03
   ------ --------------------------------- 4.6/26.4 MB 8.3 MB/s eta 0:00:03
   ------- -------------------------------- 5.0/26.4 MB 8.3 MB/s eta 0:00:03
   -------- ------------------------------- 5.4/26.4 MB 8.6 MB/s eta 0:00:03
   -------- ------------------------------- 5.8/26.4 MB 8.4 MB/s eta 0:00:03
   --------- ------------------------------ 6.1/26.4 MB 8.3 MB/s eta 0:00:03
   --------- ------------------------------ 6.6/26.4 MB 8.4 MB/s eta 0:00:03
   ---------- ----------------------------- 7.1/26.4 MB 8.7 MB/s eta 0:00:03
   ----------- ---------------------------- 7.5/26.4 MB 8.7 MB/s eta 0:00:03
   ------------ --------------------------- 8.0/26.4 MB 8.8 MB/s eta 0:00:03
   ------------ --------------------------- 8.4/26.4 MB 8.8 MB/s eta 0:00:03
   ------------- -------------------------- 8.8/26.4 MB 8.8 MB/s eta 0:00:03
   ------------- -------------------------- 9.1/26.4 MB 8.7 MB/s eta 0:00:02
   -------------- ------------------------- 9.6/26.4 MB 8.7 MB/s eta 0:00:02
   --------------- ------------------------ 10.0/26.4 MB 8.8 MB/s eta 0:00:02
   --------------- ------------------------ 10.5/26.4 MB 9.0 MB/s eta 0:00:02
   ---------------- ----------------------- 10.9/26.4 MB 9.0 MB/s eta 0:00:02
   ---------------- ----------------------- 11.1/26.4 MB 8.8 MB/s eta 0:00:02
   ---------------- ----------------------- 11.1/26.4 MB 8.4 MB/s eta 0:00:02
   ----------------- ---------------------- 11.3/26.4 MB 8.5 MB/s eta 0:00:02
   ----------------- ---------------------- 11.7/26.4 MB 8.4 MB/s eta 0:00:02
   ------------------ --------------------- 12.4/26.4 MB 8.4 MB/s eta 0:00:02
   ------------------- -------------------- 12.8/26.4 MB 8.4 MB/s eta 0:00:02
   -------------------- ------------------- 13.3/26.4 MB 8.4 MB/s eta 0:00:02
   -------------------- ------------------- 13.7/26.4 MB 8.4 MB/s eta 0:00:02
   -------------------- ------------------- 13.8/26.4 MB 8.4 MB/s eta 0:00:02
   --------------------- ------------------ 14.0/26.4 MB 8.5 MB/s eta 0:00:02
   --------------------- ------------------ 14.3/26.4 MB 8.2 MB/s eta 0:00:02
   ---------------------- ----------------- 14.6/26.4 MB 8.1 MB/s eta 0:00:02
   ---------------------- ----------------- 14.8/26.4 MB 8.0 MB/s eta 0:00:02
   ---------------------- ----------------- 15.0/26.4 MB 7.8 MB/s eta 0:00:02
   ----------------------- ---------------- 15.3/26.4 MB 7.7 MB/s eta 0:00:02
   ----------------------- ---------------- 15.6/26.4 MB 7.7 MB/s eta 0:00:02
   ------------------------ --------------- 16.0/26.4 MB 7.7 MB/s eta 0:00:02
   ------------------------ --------------- 16.2/26.4 MB 7.6 MB/s eta 0:00:02
   ------------------------- -------------- 16.5/26.4 MB 7.5 MB/s eta 0:00:02
   ------------------------- -------------- 16.8/26.4 MB 7.4 MB/s eta 0:00:02
   ------------------------- -------------- 17.1/26.4 MB 7.3 MB/s eta 0:00:02
   -------------------------- ------------- 17.5/26.4 MB 7.3 MB/s eta 0:00:02
   --------------------------- ------------ 18.0/26.4 MB 7.3 MB/s eta 0:00:02
   --------------------------- ------------ 18.2/26.4 MB 7.2 MB/s eta 0:00:02
   ---------------------------- ----------- 18.6/26.4 MB 7.1 MB/s eta 0:00:02
   ---------------------------- ----------- 18.9/26.4 MB 7.1 MB/s eta 0:00:02
   ---------------------------- ----------- 19.1/26.4 MB 7.0 MB/s eta 0:00:02
   ----------------------------- ---------- 19.4/26.4 MB 7.0 MB/s eta 0:00:02
   ----------------------------- ---------- 19.6/26.4 MB 7.0 MB/s eta 0:00:01
   ------------------------------ --------- 19.9/26.4 MB 6.8 MB/s eta 0:00:01
   ------------------------------ --------- 20.2/26.4 MB 6.7 MB/s eta 0:00:01
   ------------------------------- -------- 20.6/26.4 MB 6.7 MB/s eta 0:00:01
   ------------------------------- -------- 21.0/26.4 MB 6.7 MB/s eta 0:00:01
   -------------------------------- ------- 21.3/26.4 MB 6.7 MB/s eta 0:00:01
   -------------------------------- ------- 21.7/26.4 MB 7.1 MB/s eta 0:00:01
   --------------------------------- ------ 22.1/26.4 MB 7.0 MB/s eta 0:00:01
   ---------------------------------- ----- 22.5/26.4 MB 6.9 MB/s eta 0:00:01
   ---------------------------------- ----- 22.9/26.4 MB 6.9 MB/s eta 0:00:01
   ----------------------------------- ---- 23.2/26.4 MB 6.8 MB/s eta 0:00:01
   ----------------------------------- ---- 23.5/26.4 MB 6.7 MB/s eta 0:00:01
   ------------------------------------ --- 23.9/26.4 MB 6.7 MB/s eta 0:00:01
   ------------------------------------ --- 24.3/26.4 MB 7.0 MB/s eta 0:00:01
   ------------------------------------- -- 24.8/26.4 MB 7.1 MB/s eta 0:00:01
   ------------------------------------- -- 25.1/26.4 MB 7.2 MB/s eta 0:00:01
   -------------------------------------- - 25.3/26.4 MB 7.3 MB/s eta 0:00:01
   -------------------------------------- - 25.6/26.4 MB 7.4 MB/s eta 0:00:01
   ---------------------------------------  25.9/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.2/26.4 MB 7.2 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------  26.4/26.4 MB 7.3 MB/s eta 0:00:01
   ---------------------------------------- 26.4/26.4 MB 5.4 MB/s eta 0:00:00
Using cached ml_dtypes-0.3.2-cp311-cp311-win_amd64.whl (127 kB)
Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
Using cached protobuf-4.25.3-cp310-abi3-win_amd64.whl (413 kB)
Using cached tensorboard-2.16.2-py3-none-any.whl (5.5 MB)
Using cached tensorflow_io_gcs_filesystem-0.31.0-cp311-cp311-win_amd64.whl (1.5 MB)
Using cached termcolor-2.4.0-py3-none-any.whl (7.7 kB)
Downloading wrapt-1.16.0-cp311-cp311-win_amd64.whl (37 kB)
Downloading Markdown-3.6-py3-none-any.whl (105 kB)
   ---------------------------------------- 0.0/105.4 kB ? eta -:--:--
   ---------------------------------------- 105.4/105.4 kB 5.9 MB/s eta 0:00:00
Using cached tensorboard_data_server-0.7.2-py3-none-any.whl (2.4 kB)
Downloading werkzeug-3.0.3-py3-none-any.whl (227 kB)
   ---------------------------------------- 0.0/227.3 kB ? eta -:--:--
   ---------------------------------------- 227.3/227.3 kB 7.0 MB/s eta 0:00:00
Using cached wheel-0.43.0-py3-none-any.whl (65 kB)
Using cached namex-0.0.8-py3-none-any.whl (5.8 kB)
Using cached optree-0.11.0-cp311-cp311-win_amd64.whl (245 kB)
Using cached rich-13.7.1-py3-none-any.whl (240 kB)
Using cached markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: namex, libclang, flatbuffers, wrapt, wheel, werkzeug, termcolor, tensorflow-io-gcs-filesystem, tensorboard-data-server, protobuf, optree, opt-einsum, ml-dtypes, mdurl, markdown, h5py, grpcio, google-pasta, gast, absl-py, tensorboard, markdown-it-py, astunparse, rich, keras, tensorflow-intel, tensorflow
Successfully installed absl-py-2.1.0 astunparse-1.6.3 flatbuffers-24.3.25 gast-0.5.4 google-pasta-0.2.0 grpcio-1.64.0 h5py-3.11.0 keras-3.3.3 libclang-18.1.1 markdown-3.6 markdown-it-py-3.0.0 mdurl-0.1.2 ml-dtypes-0.3.2 namex-0.0.8 opt-einsum-3.3.0 optree-0.11.0 protobuf-4.25.3 rich-13.7.1 tensorboard-2.16.2 tensorboard-data-server-0.7.2 tensorflow-2.16.1 tensorflow-intel-2.16.1 tensorflow-io-gcs-filesystem-0.31.0 termcolor-2.4.0 werkzeug-3.0.3 wheel-0.43.0 wrapt-1.16.0
In [ ]:
!pip install tqdm
Collecting tqdm
  Downloading tqdm-4.66.4-py3-none-any.whl.metadata (57 kB)
     ---------------------------------------- 0.0/57.6 kB ? eta -:--:--
     ------------- ------------------------ 20.5/57.6 kB 640.0 kB/s eta 0:00:01
     -------------------------------------- 57.6/57.6 kB 606.6 kB/s eta 0:00:00
Requirement already satisfied: colorama in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from tqdm) (0.4.6)
Downloading tqdm-4.66.4-py3-none-any.whl (78 kB)
   ---------------------------------------- 0.0/78.3 kB ? eta -:--:--
   ---------------------------------------- 78.3/78.3 kB 2.2 MB/s eta 0:00:00
Installing collected packages: tqdm
Successfully installed tqdm-4.66.4
In [ ]:
import cv2 as cv
from PIL import Image
import numpy as np
import pandas as pd
import tensorflow as tf
print(tf.__version__)
import os
import shutil
import matplotlib.pyplot as plt
import seaborn as sns
import json
import tqdm
from IPython.display import clear_output
from collections import Counter

%matplotlib inline
2.16.1

Exploration des données

Détails des données du dataset

Dans le cadre du projet de détection et de reconnaissance, nous devrons exploiter un dataset qui se compose de plusieurs dossiers :

  • Un dossier de villes différentes : Ce dossier est classé par villes, montre des images prises en voiture et montrant la route
  • Un dossier de outputs : il contient un groupe d'images regroupées comme suit : (image, masque en couleurs qui montre toutes les classes reconnues, masque en gris qui regroupe certaines classes, chose qui réduit le nombre de classes détectées).

Pour plus d'ergonomie et dans le but d'une exploitation optimale des modèles et des ressources dont nous disposons, nous avons décidé de sélectionner seulement quelques villes pour l'entraînement et nous ajouterons graduellement de plus en plus de données pour obtenir de meilleurs résultats.

Les noms des images ont les particularités suivantes:

  • leftmg8bits = images originales (utilisées comme input X).
  • gtFine_labelIds = images annotées en niveaux de gris = masques où chaque pixel est mappé à son ID de classe (utilisées comme y). Ce sont les masques qui seront utilisés en association avec les images pour entraîner le modèle.
  • gtFine_color = images annotées en couleur = masques d'annotations où chaque classe est représentée par une couleur (utilisées à des fins de visualisation, pas utilisées comme input).
  • gtFine_instanceIds = ce sont les instances de chaque catégorie, à savoir les identifiants uniques attribués à chaque instance.
  • gtFine_polygons = polygones stockés dans un json (ce sont les annotations originales avant transformation au format raster).

Output : masque de segmentation où chaque pixel est classé dans sa catégorie. Même dimension que l'image d'entrée, avec 8 masques de prédictions pour les prédictions brutes, et 1 pour les prédictions argmax (1 pixel = 1 classe ayant la plus forte probabilité).

Dans ce projet nous avons choisi un encodage des masques avec les labelsIds : chaque pixel = une classe, plutôt qu'un encodage en one-hot où chaque pixel serait représenté par un vecteur de taille 8.

Il est possible de visualiser le dataset sur le repo GitHub lié à Cityscapes. https://github.com/mcordts/cityscapesScripts

Ce repo fournit le schéma des labels (une trentaine) et les 8 sous-catégories sur lesquelles nous allons fonder les sorties du modèle.

In [ ]:
from collections import namedtuple
In [ ]:
# a label and all meta information
Label = namedtuple( 'Label' , [

    'name'        , # The identifier of this label, e.g. 'car', 'person', ... .
                    # We use them to uniquely name a class

    'id'          , # An integer ID that is associated with this label.
                    # The IDs are used to represent the label in ground truth images
                    # An ID of -1 means that this label does not have an ID and thus
                    # is ignored when creating ground truth images (e.g. license plate).
                    # Do not modify these IDs, since exactly these IDs are expected by the
                    # evaluation server.

    'trainId'     , # Feel free to modify these IDs as suitable for your method. Then create
                    # ground truth images with train IDs, using the tools provided in the
                    # 'preparation' folder. However, make sure to validate or submit results
                    # to our evaluation server using the regular IDs above!
                    # For trainIds, multiple labels might have the same ID. Then, these labels
                    # are mapped to the same class in the ground truth images. For the inverse
                    # mapping, we use the label that is defined first in the list below.
                    # For example, mapping all void-type classes to the same ID in training,
                    # might make sense for some approaches.
                    # Max value is 255!

    'category'    , # The name of the category that this label belongs to

    'categoryId'  , # The ID of this category. Used to create ground truth images
                    # on category level.

    'hasInstances', # Whether this label distinguishes between single instances or not

    'ignoreInEval', # Whether pixels having this class as ground truth label are ignored
                    # during evaluations or not

    'color'       , # The color of this label
    ] )
In [ ]:
Label.category
Out[ ]:
_tuplegetter(3, 'Alias for field number 3')

Dans cette partie, nous donnons la particularité de chaque classe de départ (les 34), nom, ID, ID d'entraînement, ID de catégorie, et surtout le code de la couleur qui les représente sur les masques de couleurs

Sur les images colorées, ce sont les 34 labels qui sont représentés. Nous nous intéresserons dans ce projet à 8 classes uniquement.

In [ ]:
labels = [
    #       name                     id    trainId   category            catId     hasInstances   ignoreInEval   color
    Label(  'unlabeled'            ,  0 ,      255 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'ego vehicle'          ,  1 ,      255 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'rectification border' ,  2 ,      255 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'out of roi'           ,  3 ,      255 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'static'               ,  4 ,      255 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'dynamic'              ,  5 ,      255 , 'void'            , 0       , False        , True         , (111, 74,  0) ),
    Label(  'ground'               ,  6 ,      255 , 'void'            , 0       , False        , True         , ( 81,  0, 81) ),
    Label(  'road'                 ,  7 ,        0 , 'flat'            , 1       , False        , False        , (128, 64,128) ),
    Label(  'sidewalk'             ,  8 ,        1 , 'flat'            , 1       , False        , False        , (244, 35,232) ),
    Label(  'parking'              ,  9 ,      255 , 'flat'            , 1       , False        , True         , (250,170,160) ),
    Label(  'rail track'           , 10 ,      255 , 'flat'            , 1       , False        , True         , (230,150,140) ),
    Label(  'building'             , 11 ,        2 , 'construction'    , 2       , False        , False        , ( 70, 70, 70) ),
    Label(  'wall'                 , 12 ,        3 , 'construction'    , 2       , False        , False        , (102,102,156) ),
    Label(  'fence'                , 13 ,        4 , 'construction'    , 2       , False        , False        , (190,153,153) ),
    Label(  'guard rail'           , 14 ,      255 , 'construction'    , 2       , False        , True         , (180,165,180) ),
    Label(  'bridge'               , 15 ,      255 , 'construction'    , 2       , False        , True         , (150,100,100) ),
    Label(  'tunnel'               , 16 ,      255 , 'construction'    , 2       , False        , True         , (150,120, 90) ),
    Label(  'pole'                 , 17 ,        5 , 'object'          , 3       , False        , False        , (153,153,153) ),
    Label(  'polegroup'            , 18 ,      255 , 'object'          , 3       , False        , True         , (153,153,153) ),
    Label(  'traffic light'        , 19 ,        6 , 'object'          , 3       , False        , False        , (250,170, 30) ),
    Label(  'traffic sign'         , 20 ,        7 , 'object'          , 3       , False        , False        , (220,220,  0) ),
    Label(  'vegetation'           , 21 ,        8 , 'nature'          , 4       , False        , False        , (107,142, 35) ),
    Label(  'terrain'              , 22 ,        9 , 'nature'          , 4       , False        , False        , (152,251,152) ),
    Label(  'sky'                  , 23 ,       10 , 'sky'             , 5       , False        , False        , ( 70,130,180) ),
    Label(  'person'               , 24 ,       11 , 'human'           , 6       , True         , False        , (220, 20, 60) ),
    Label(  'rider'                , 25 ,       12 , 'human'           , 6       , True         , False        , (255,  0,  0) ),
    Label(  'car'                  , 26 ,       13 , 'vehicle'         , 7       , True         , False        , (  0,  0,142) ),
    Label(  'truck'                , 27 ,       14 , 'vehicle'         , 7       , True         , False        , (  0,  0, 70) ),
    Label(  'bus'                  , 28 ,       15 , 'vehicle'         , 7       , True         , False        , (  0, 60,100) ),
    Label(  'caravan'              , 29 ,      255 , 'vehicle'         , 7       , True         , True         , (  0,  0, 90) ),
    Label(  'trailer'              , 30 ,      255 , 'vehicle'         , 7       , True         , True         , (  0,  0,110) ),
    Label(  'train'                , 31 ,       16 , 'vehicle'         , 7       , True         , False        , (  0, 80,100) ),
    Label(  'motorcycle'           , 32 ,       17 , 'vehicle'         , 7       , True         , False        , (  0,  0,230) ),
    Label(  'bicycle'              , 33 ,       18 , 'vehicle'         , 7       , True         , False        , (119, 11, 32) ),
    Label(  'license plate'        , -1 ,       -1 , 'vehicle'         , 7       , False        , True         , (  0,  0,142) ),
]

Nous allons transformer les classes précedantes en les 8 classes que nous voulons .

In [ ]:
# name to label object
name2label      = { label.name    : label for label in labels           }
# id to label object
id2label        = { label.id      : label for label in labels           }
# trainId to label object
id2category     = { label[4]   : label.category for label in labels  }
trainId2label   = { label.trainId : label for label in reversed(labels) }
# category to list of label objects
category2labels = {}
for label in labels:
    category = label.category
    if category in category2labels:
        category2labels[category].append(label)
    else:
        category2labels[category] = [label]
In [ ]:
trainId2label
Out[ ]:
{-1: Label(name='license plate', id=-1, trainId=-1, category='vehicle', categoryId=7, hasInstances=False, ignoreInEval=True, color=(0, 0, 142)),
 18: Label(name='bicycle', id=33, trainId=18, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(119, 11, 32)),
 17: Label(name='motorcycle', id=32, trainId=17, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 0, 230)),
 16: Label(name='train', id=31, trainId=16, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 80, 100)),
 255: Label(name='unlabeled', id=0, trainId=255, category='void', categoryId=0, hasInstances=False, ignoreInEval=True, color=(0, 0, 0)),
 15: Label(name='bus', id=28, trainId=15, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 60, 100)),
 14: Label(name='truck', id=27, trainId=14, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 0, 70)),
 13: Label(name='car', id=26, trainId=13, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 0, 142)),
 12: Label(name='rider', id=25, trainId=12, category='human', categoryId=6, hasInstances=True, ignoreInEval=False, color=(255, 0, 0)),
 11: Label(name='person', id=24, trainId=11, category='human', categoryId=6, hasInstances=True, ignoreInEval=False, color=(220, 20, 60)),
 10: Label(name='sky', id=23, trainId=10, category='sky', categoryId=5, hasInstances=False, ignoreInEval=False, color=(70, 130, 180)),
 9: Label(name='terrain', id=22, trainId=9, category='nature', categoryId=4, hasInstances=False, ignoreInEval=False, color=(152, 251, 152)),
 8: Label(name='vegetation', id=21, trainId=8, category='nature', categoryId=4, hasInstances=False, ignoreInEval=False, color=(107, 142, 35)),
 7: Label(name='traffic sign', id=20, trainId=7, category='object', categoryId=3, hasInstances=False, ignoreInEval=False, color=(220, 220, 0)),
 6: Label(name='traffic light', id=19, trainId=6, category='object', categoryId=3, hasInstances=False, ignoreInEval=False, color=(250, 170, 30)),
 5: Label(name='pole', id=17, trainId=5, category='object', categoryId=3, hasInstances=False, ignoreInEval=False, color=(153, 153, 153)),
 4: Label(name='fence', id=13, trainId=4, category='construction', categoryId=2, hasInstances=False, ignoreInEval=False, color=(190, 153, 153)),
 3: Label(name='wall', id=12, trainId=3, category='construction', categoryId=2, hasInstances=False, ignoreInEval=False, color=(102, 102, 156)),
 2: Label(name='building', id=11, trainId=2, category='construction', categoryId=2, hasInstances=False, ignoreInEval=False, color=(70, 70, 70)),
 1: Label(name='sidewalk', id=8, trainId=1, category='flat', categoryId=1, hasInstances=False, ignoreInEval=False, color=(244, 35, 232)),
 0: Label(name='road', id=7, trainId=0, category='flat', categoryId=1, hasInstances=False, ignoreInEval=False, color=(128, 64, 128))}
In [ ]:
id2category
Out[ ]:
{0: 'void',
 1: 'flat',
 2: 'construction',
 3: 'object',
 4: 'nature',
 5: 'sky',
 6: 'human',
 7: 'vehicle'}

Affichage d'une image originale et des couleurs de labels.

Import des données.

Pour plus d'ergonomie, nous avons sélectionné un échantillon de chaque dossier et nous les avons classés dans les dossiers X (données d'entrée avec les données brutes) et Y (données de sortie avec les masques).

In [ ]:
path_X = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_train_stuttgart/stuttgart_000001_000019_leftImg8bit.png"
path_y = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_train_mask_stuttgart/"
In [ ]:
image = Image.open(path_X)
image_color = Image.open(path_y + "stuttgart_000001_000019_gtFine_color.png")
print(image.mode, image_color.mode)
print(image.size, image_color.size)
RGB RGBA
(2048, 1024) (2048, 1024)
In [ ]:
plt.figure(figsize=(10, 10))
plt.subplot(2, 2, 1), plt.imshow(image)
plt.title('Image originale')
plt.axis('off')
plt.subplot(2, 2, 2), plt.imshow(image_color)
plt.title('Couleurs des labels')
plt.axis('off');
No description has been provided for this image

Sur les images colorées, ce sont les 34 labels qui sont représentés. Nous nous intéresserons dans ce projet à 8 classes uniquement. Il est essentiel d'analyser les données des villes différentes, afin d'assurer que la différence de villes n'influe pas sur la qualité des entrainnement .

In [ ]:
# histogramme de l'image pour les 3 canaux (rouge, vert, bleu)
image_np = np.array(image)

color = ('b','g','r')

for i,col in enumerate(color):
    histr = cv.calcHist([image_np],[i],None,[256],[0,256])
    plt.plot(histr,color = col)
    plt.xlim([0,256])
plt.title('Histogramme')
plt.xlabel('Valeurs des pixels')
plt.ylabel('Nombre de pixels')
plt.show()
No description has been provided for this image

L'histogramme nous permet de visualiser la distribution des valeurs de pixels pour chaque canal RBG. Nous voyons ici une répartition suivant le même pattern pour chaque canal, à savoir une majorité de valeurs de pixels dans la plage 25-140 environ, et une allure de la courbe bimodale (modes à 45 et 120 environ).

Affichage des masques d'annotations

In [ ]:
labels = Image.open(path_y + "stuttgart_000001_000019_gtFine_labelIds.png")
In [ ]:
print(labels.mode)
print(labels.size)
L
(2048, 1024)

Analyse des tailles.

On remarque que le masque présente les mêmes dimensions que l'image originale et est en niveaux de gris.

In [ ]:
matrix = np.array(labels)
print(matrix.shape)
print(np.unique(matrix))
(1024, 2048)
[ 1  3  4  5  7  8 11 13 17 19 20 21 23 24 26 27]

Il s'agit d'une matrice de labels, nous pouvons voir quels sont les labels représentés ici.

In [ ]:
# affichage
plt.figure(figsize=(10, 10))
plt.subplot(2, 2, 1), plt.imshow(labels, cmap='gray')
plt.title('Masque en niveaux de gris')
plt.axis('off')
plt.subplot(2, 2, 2), plt.imshow(labels)
plt.title('Masque en couleurs')
plt.axis('off');
No description has been provided for this image

Préparation des données

Cette partie consiste à appliquer les masques prédéfinis aux images brutes avant de les exploiter pour un modèle.

Démarche : nous n'avons pas sélectionné tout le dataset, pour des raisons de ressources. Nous avons sélectionné certaines villes pour constituer les datasets train, val et test, selon un ratio de 80:20 environ pour le train et le val, et environ 10% pour le dataset test.

In [ ]:
def load_image(image_path, size=(128, 128)):
    # Ouvrir l'image, redimensionner et normaliser
    image = Image.open(image_path).resize(size)
    return np.array(image) / 255.0  # Normalized RGB image

def load_mask(mask_path, size=(128, 128)):
    # Ouvrir le masque et le redimensionner
    mask = Image.open(mask_path).resize(size)
    return np.array(mask)  # Label IDs (no normalization)
In [ ]:
train_images = []
train_masks = []
val_images = []
val_masks = []
test_images = []
test_masks = []

# chemins des répertoires
train_images_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_train_stuttgart_tubingen_strasbourg_ulm_bremen_hamburg_zurich"
train_masks_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_train_mask_stuttgart_tubingen_strasbourg_ulm_bremen_hamburg_zurich"
val_images_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_val_frankfurt"
val_masks_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_val_mask_frankfurt"
test_images_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_test_jena"
test_masks_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_test_mask_jena"

# chargement des images et des masques pour l'ensemble d'entraînement
for filename in os.listdir(train_images_dir):
    if filename.endswith('_leftImg8bit.png'):
        image_path = os.path.join(train_images_dir, filename)
        mask_filename = filename.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
        mask_path = os.path.join(train_masks_dir, mask_filename)

        if os.path.exists(mask_path):
            image = load_image(image_path, size=(128, 128))
            mask = load_mask(mask_path, size=(128, 128))
            
            train_images.append(image)
            train_masks.append(mask)

# chargement des images et des masques pour l'ensemble de validation
for filename in os.listdir(val_images_dir):
    if filename.endswith('_leftImg8bit.png'):
        image_path = os.path.join(val_images_dir, filename)
        mask_filename = filename.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
        mask_path = os.path.join(val_masks_dir, mask_filename)

        if os.path.exists(mask_path):
            image = load_image(image_path, size=(128, 128))
            mask = load_mask(mask_path, size=(128, 128))
            
            val_images.append(image)
            val_masks.append(mask)

# chargement des images et des masques pour l'ensemble de test
for filename in os.listdir(test_images_dir):
    if filename.endswith('_leftImg8bit.png'):
        image_path = os.path.join(test_images_dir, filename)
        mask_filename = filename.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
        mask_path = os.path.join(test_masks_dir, mask_filename)

        if os.path.exists(mask_path):
            image = load_image(image_path, size=(128, 128))
            mask = load_mask(mask_path, size=(128, 128))
            
            test_images.append(image)
            test_masks.append(mask)

# convertir les listes en tableaux numpy
train_images = np.array(train_images)
train_masks = np.array(train_masks)
val_images = np.array(val_images)
val_masks = np.array(val_masks)
test_images = np.array(test_images)
test_masks = np.array(test_masks)
In [ ]:
print(train_images.shape, train_masks.shape, val_images.shape, val_masks.shape, test_images.shape, test_masks.shape)
(1486, 128, 128, 3) (1486, 128, 128) (267, 128, 128, 3) (267, 128, 128) (119, 128, 128, 3) (119, 128, 128)
In [ ]:
train_masks[0]
Out[ ]:
array([[ 8, 10, 10, ..., 10, 10,  8],
       [15, 21, 21, ..., 21, 21, 16],
       [15, 21, 21, ..., 21, 21, 16],
       ...,
       [ 6,  7,  7, ...,  7,  7,  6],
       [ 6,  7,  7, ...,  7,  7,  6],
       [ 4,  4,  4, ...,  5,  5,  4]], dtype=uint8)
In [ ]:
np.unique(train_masks)
Out[ ]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33],
      dtype=uint8)

Renomer les labels. Les masques sont encodés avec les 34 labels. Il faut mapper les labels avec leurs catégories (8 en tout).

In [ ]:
def remap_labels_to_categories(mask):
    label_id_to_category_id = {
        0: 0,  # 'void'
        1: 0,  # 'void'
        2: 0,  # 'void'
        3: 0,  # 'void'
        4: 0,  # 'void'
        5: 0,  # 'void'
        6: 0,  # 'void'
        7: 1,  # 'flat'
        8: 1,  # 'flat'
        9: 1,  # 'flat'
        10: 1,  # 'flat'
        11: 2,  # 'construction'
        12: 2,  # 'construction'
        13: 2,  # 'construction'
        14: 2,  # 'construction'
        15: 2,  # 'construction'
        16: 2,  # 'construction'
        17: 3,  # 'object'
        18: 3,  # 'object'
        19: 3,  # 'object'
        20: 3,  # 'object'
        21: 4,  # 'nature'
        22: 4,  # 'nature'
        23: 5,  # 'sky'
        24: 6,  # 'human'
        25: 6,  # 'human'
        26: 7,  # 'vehicle'
        27: 7,  # 'vehicle'
        28: 7,  # 'vehicle'
        29: 7,  # 'vehicle'
        30: 7,  # 'vehicle'
        31: 7,  # 'vehicle'
        32: 7,  # 'vehicle'
        33: 7,  # 'vehicle'
        -1: 7   # 'vehicle'
    }

    # exclure les valeurs du masque qui ne sont pas présentes dans le dictionnaire
    mask[mask < 0] = 0
    mask[mask > 33] = 0

    # remapping de chaque valeur du masque à son identifiant de catégorie correspondant
    remapped_mask = np.vectorize(label_id_to_category_id.get)(mask)
    return remapped_mask

# remapping des masques dans train_masks, val_masks et test_masks
remapped_train_masks = np.array([remap_labels_to_categories(mask) for mask in train_masks])
remapped_val_masks = np.array([remap_labels_to_categories(mask) for mask in val_masks])
remapped_test_masks = np.array([remap_labels_to_categories(mask) for mask in test_masks])
In [ ]:
np.unique(remapped_train_masks)
Out[ ]:
array([0, 1, 2, 3, 4, 5, 6, 7])
In [ ]:
print(remapped_train_masks.shape, remapped_val_masks.shape, remapped_test_masks.shape)
(1486, 128, 128) (267, 128, 128) (119, 128, 128)

On remarque que la segmentation obtenue est satifaisante.

In [ ]:
# train
plt.figure(figsize=(8, 6))
plt.subplot(1, 2, 1)
plt.title('Train Image')
plt.imshow(train_images[0])
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Train Mask')
plt.imshow(remapped_train_masks[0])
plt.axis('off')
plt.show()

# val
plt.figure(figsize=(8, 6))
plt.subplot(1, 2, 1)
plt.title('Validation Image')
plt.imshow(val_images[0])
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Validation Mask')
plt.imshow(remapped_val_masks[0])
plt.axis('off')
plt.show()

# test
plt.figure(figsize=(8, 6))
plt.subplot(1, 2, 1)
plt.title('Test Image')
plt.imshow(test_images[0])
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Test Mask')
plt.imshow(remapped_test_masks[0])
plt.axis('off')
plt.show()
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Implémentation du modèle U-NET

Le U-net est une architecture qui permet d'encoder une donnée (visuelle, sonore, numérique comme un vecteur ou un tuple) en un équivalent plus petit (réduction de dimmentionnalité).

In [ ]:
# création d'une classe pour ajouter la métrique IoU dans le fit
from tensorflow.keras.metrics import MeanIoU

class UpdatedMeanIoU(MeanIoU):
  def __init__(self,
               y_true=None,
               y_pred=None,
               num_classes=None,
               name=None,
               dtype=None):
    super(UpdatedMeanIoU, self).__init__(num_classes = num_classes,name=name, dtype=dtype)

  def update_state(self, y_true, y_pred, sample_weight=None):
    y_pred = tf.math.argmax(y_pred, axis=-1)
    return super().update_state(y_true, y_pred, sample_weight)
In [ ]:
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
from tensorflow.keras.models import Model

def unet(input_shape=(128, 128, 3)):
    inputs = Input(input_shape)

    # encoder
    conv1 = Conv2D(64, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(64, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Conv2D(128, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    # bridge

    conv3 = Conv2D(256, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(256, 3, activation='relu', padding='same')(conv3)

    # decoder 
    
    up4 = Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv3)
    up4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(128, 3, activation='relu', padding='same')(up4)
    conv4 = Conv2D(128, 3, activation='relu', padding='same')(conv4)

    up5 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv4)
    up5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(64, 3, activation='relu', padding='same')(up5)
    conv5 = Conv2D(64, 3, activation='relu', padding='same')(conv5)

    # output layer
    outputs = Conv2D(8, 1, activation='softmax')(conv5) # 8 classes de sortie

    # créer le modèle
    model = Model(inputs=inputs, outputs=outputs)
    return model

# créer une instance du modèle U-Net
model = unet()

# compiler le modèle
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy', UpdatedMeanIoU(num_classes=8, name = "mean_iou")])

Il est possible d'avoir plus de détail sur le model avec la fonction suivante

In [ ]:
model.summary()
Model: "functional_1"
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type)        ┃ Output Shape      ┃    Param # ┃ Connected to      ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer         │ (None, 128, 128,  │          0 │ -                 │
│ (InputLayer)        │ 3)                │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d (Conv2D)     │ (None, 128, 128,  │      1,792 │ input_layer[0][0] │
│                     │ 64)               │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_1 (Conv2D)   │ (None, 128, 128,  │     36,928 │ conv2d[0][0]      │
│                     │ 64)               │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ max_pooling2d       │ (None, 64, 64,    │          0 │ conv2d_1[0][0]    │
│ (MaxPooling2D)      │ 64)               │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_2 (Conv2D)   │ (None, 64, 64,    │     73,856 │ max_pooling2d[0]… │
│                     │ 128)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_3 (Conv2D)   │ (None, 64, 64,    │    147,584 │ conv2d_2[0][0]    │
│                     │ 128)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ max_pooling2d_1     │ (None, 32, 32,    │          0 │ conv2d_3[0][0]    │
│ (MaxPooling2D)      │ 128)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_4 (Conv2D)   │ (None, 32, 32,    │    295,168 │ max_pooling2d_1[… │
│                     │ 256)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_5 (Conv2D)   │ (None, 32, 32,    │    590,080 │ conv2d_4[0][0]    │
│                     │ 256)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose    │ (None, 64, 64,    │    131,200 │ conv2d_5[0][0]    │
│ (Conv2DTranspose)   │ 128)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ concatenate         │ (None, 64, 64,    │          0 │ conv2d_3[0][0],   │
│ (Concatenate)       │ 256)              │            │ conv2d_transpose… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_6 (Conv2D)   │ (None, 64, 64,    │    295,040 │ concatenate[0][0] │
│                     │ 128)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_7 (Conv2D)   │ (None, 64, 64,    │    147,584 │ conv2d_6[0][0]    │
│                     │ 128)              │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_1  │ (None, 128, 128,  │     32,832 │ conv2d_7[0][0]    │
│ (Conv2DTranspose)   │ 64)               │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ concatenate_1       │ (None, 128, 128,  │          0 │ conv2d_1[0][0],   │
│ (Concatenate)       │ 128)              │            │ conv2d_transpose… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_8 (Conv2D)   │ (None, 128, 128,  │     73,792 │ concatenate_1[0]… │
│                     │ 64)               │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_9 (Conv2D)   │ (None, 128, 128,  │     36,928 │ conv2d_8[0][0]    │
│                     │ 64)               │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_10 (Conv2D)  │ (None, 128, 128,  │        520 │ conv2d_9[0][0]    │
│                     │ 8)                │            │                   │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘
 Total params: 1,863,304 (7.11 MB)
 Trainable params: 1,863,304 (7.11 MB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# paramètres
TRAIN_LENGTH = len(train_images)
BATCH_SIZE = 64
BUFFER_SIZE = 1000  # nb d'éléments à garder en mémoire tampon lors du shuffle
STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE
EPOCHS = 100
VAL_SUBSPLITS = 5
VALIDATION_STEPS = len(val_images) // BATCH_SIZE // VAL_SUBSPLITS

# préparation des datasets
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, remapped_train_masks))
train_dataset = train_dataset.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()

val_dataset = tf.data.Dataset.from_tensor_slices((val_images, remapped_val_masks))
val_dataset = val_dataset.batch(BATCH_SIZE)

Entraînement

In [ ]:
from tensorflow.keras.callbacks import EarlyStopping

model_history = model.fit(train_dataset, epochs=EPOCHS,
                          steps_per_epoch=STEPS_PER_EPOCH,
                          validation_steps=VALIDATION_STEPS,
                          validation_data=val_dataset,
                          callbacks=[EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)])
Epoch 1/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 360s 15s/step - accuracy: 0.3856 - loss: 1.6930 - mean_iou: 0.0918 - val_accuracy: 0.4704 - val_loss: 1.4609 - val_mean_iou: 0.1446
Epoch 2/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 294s 13s/step - accuracy: 0.5387 - loss: 1.2937 - mean_iou: 0.2300 - val_accuracy: 0.5644 - val_loss: 1.1870 - val_mean_iou: 0.2606
Epoch 3/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 315s 14s/step - accuracy: 0.6210 - loss: 1.0707 - mean_iou: 0.2971 - val_accuracy: 0.6019 - val_loss: 1.1072 - val_mean_iou: 0.2743
Epoch 4/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.6479 - loss: 1.0147 - mean_iou: 0.3160 - val_accuracy: 0.6639 - val_loss: 0.9969 - val_mean_iou: 0.3218
Epoch 5/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.6881 - loss: 0.9231 - mean_iou: 0.3589 - val_accuracy: 0.6957 - val_loss: 0.9077 - val_mean_iou: 0.3740
Epoch 6/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.7170 - loss: 0.8525 - mean_iou: 0.3977 - val_accuracy: 0.7066 - val_loss: 0.8818 - val_mean_iou: 0.3806
Epoch 7/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 321s 14s/step - accuracy: 0.7333 - loss: 0.8142 - mean_iou: 0.4162 - val_accuracy: 0.7227 - val_loss: 0.8413 - val_mean_iou: 0.3939
Epoch 8/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 345s 15s/step - accuracy: 0.7476 - loss: 0.7824 - mean_iou: 0.4288 - val_accuracy: 0.7390 - val_loss: 0.8009 - val_mean_iou: 0.4150
Epoch 9/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.7620 - loss: 0.7390 - mean_iou: 0.4440 - val_accuracy: 0.7418 - val_loss: 0.7945 - val_mean_iou: 0.4185
Epoch 10/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.7704 - loss: 0.7212 - mean_iou: 0.4528 - val_accuracy: 0.7590 - val_loss: 0.7528 - val_mean_iou: 0.4250
Epoch 11/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 322s 14s/step - accuracy: 0.7780 - loss: 0.6923 - mean_iou: 0.4595 - val_accuracy: 0.7726 - val_loss: 0.7129 - val_mean_iou: 0.4528
Epoch 12/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 327s 14s/step - accuracy: 0.7855 - loss: 0.6779 - mean_iou: 0.4701 - val_accuracy: 0.7460 - val_loss: 0.7755 - val_mean_iou: 0.4344
Epoch 13/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 322s 14s/step - accuracy: 0.7920 - loss: 0.6605 - mean_iou: 0.4790 - val_accuracy: 0.7735 - val_loss: 0.7062 - val_mean_iou: 0.4509
Epoch 14/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 315s 14s/step - accuracy: 0.7996 - loss: 0.6386 - mean_iou: 0.4871 - val_accuracy: 0.7763 - val_loss: 0.6898 - val_mean_iou: 0.4545
Epoch 15/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 333s 14s/step - accuracy: 0.8068 - loss: 0.6166 - mean_iou: 0.4945 - val_accuracy: 0.7771 - val_loss: 0.6812 - val_mean_iou: 0.4600
Epoch 16/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 328s 14s/step - accuracy: 0.8061 - loss: 0.6169 - mean_iou: 0.4960 - val_accuracy: 0.7911 - val_loss: 0.6575 - val_mean_iou: 0.4748
Epoch 17/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.8117 - loss: 0.6030 - mean_iou: 0.4986 - val_accuracy: 0.7872 - val_loss: 0.6643 - val_mean_iou: 0.4799
Epoch 18/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.8166 - loss: 0.5858 - mean_iou: 0.5093 - val_accuracy: 0.7692 - val_loss: 0.7107 - val_mean_iou: 0.4643
Epoch 19/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.8098 - loss: 0.6059 - mean_iou: 0.5033 - val_accuracy: 0.7996 - val_loss: 0.6263 - val_mean_iou: 0.4885
Epoch 20/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 319s 14s/step - accuracy: 0.8232 - loss: 0.5696 - mean_iou: 0.5201 - val_accuracy: 0.7953 - val_loss: 0.6327 - val_mean_iou: 0.4905
Epoch 21/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 322s 14s/step - accuracy: 0.8259 - loss: 0.5576 - mean_iou: 0.5268 - val_accuracy: 0.7981 - val_loss: 0.6284 - val_mean_iou: 0.5000
Epoch 22/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 310s 13s/step - accuracy: 0.8262 - loss: 0.5581 - mean_iou: 0.5317 - val_accuracy: 0.8004 - val_loss: 0.6196 - val_mean_iou: 0.4945
Epoch 23/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8299 - loss: 0.5463 - mean_iou: 0.5384 - val_accuracy: 0.7986 - val_loss: 0.6302 - val_mean_iou: 0.5032
Epoch 24/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 313s 14s/step - accuracy: 0.8346 - loss: 0.5335 - mean_iou: 0.5460 - val_accuracy: 0.8114 - val_loss: 0.5935 - val_mean_iou: 0.5274
Epoch 25/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 324s 14s/step - accuracy: 0.8390 - loss: 0.5178 - mean_iou: 0.5571 - val_accuracy: 0.8022 - val_loss: 0.6207 - val_mean_iou: 0.5172
Epoch 26/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 321s 14s/step - accuracy: 0.8362 - loss: 0.5274 - mean_iou: 0.5531 - val_accuracy: 0.8108 - val_loss: 0.6027 - val_mean_iou: 0.5255
Epoch 27/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 336s 14s/step - accuracy: 0.8417 - loss: 0.5094 - mean_iou: 0.5675 - val_accuracy: 0.8110 - val_loss: 0.5941 - val_mean_iou: 0.5272
Epoch 28/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 309s 13s/step - accuracy: 0.8411 - loss: 0.5115 - mean_iou: 0.5635 - val_accuracy: 0.8156 - val_loss: 0.5801 - val_mean_iou: 0.5370
Epoch 29/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8527 - loss: 0.4742 - mean_iou: 0.5813 - val_accuracy: 0.8072 - val_loss: 0.6009 - val_mean_iou: 0.5310
Epoch 30/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8438 - loss: 0.5027 - mean_iou: 0.5741 - val_accuracy: 0.8101 - val_loss: 0.5978 - val_mean_iou: 0.5297
Epoch 31/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 330s 14s/step - accuracy: 0.8476 - loss: 0.4895 - mean_iou: 0.5779 - val_accuracy: 0.8174 - val_loss: 0.5719 - val_mean_iou: 0.5397
Epoch 32/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 311s 13s/step - accuracy: 0.8494 - loss: 0.4845 - mean_iou: 0.5793 - val_accuracy: 0.8139 - val_loss: 0.5813 - val_mean_iou: 0.5312
Epoch 33/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 313s 13s/step - accuracy: 0.8538 - loss: 0.4716 - mean_iou: 0.5859 - val_accuracy: 0.8146 - val_loss: 0.5894 - val_mean_iou: 0.5378
Epoch 34/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8555 - loss: 0.4642 - mean_iou: 0.5937 - val_accuracy: 0.8236 - val_loss: 0.5579 - val_mean_iou: 0.5431
Epoch 35/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 324s 14s/step - accuracy: 0.8576 - loss: 0.4578 - mean_iou: 0.5973 - val_accuracy: 0.8209 - val_loss: 0.5679 - val_mean_iou: 0.5489
Epoch 36/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.8604 - loss: 0.4465 - mean_iou: 0.6014 - val_accuracy: 0.8252 - val_loss: 0.5539 - val_mean_iou: 0.5490
Epoch 37/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 325s 14s/step - accuracy: 0.8570 - loss: 0.4579 - mean_iou: 0.5977 - val_accuracy: 0.8289 - val_loss: 0.5443 - val_mean_iou: 0.5544
Epoch 38/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.8600 - loss: 0.4496 - mean_iou: 0.6037 - val_accuracy: 0.8257 - val_loss: 0.5526 - val_mean_iou: 0.5535
Epoch 39/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 315s 14s/step - accuracy: 0.8610 - loss: 0.4449 - mean_iou: 0.6036 - val_accuracy: 0.8304 - val_loss: 0.5367 - val_mean_iou: 0.5637
Epoch 40/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 306s 13s/step - accuracy: 0.8638 - loss: 0.4352 - mean_iou: 0.6118 - val_accuracy: 0.8286 - val_loss: 0.5471 - val_mean_iou: 0.5599
Epoch 41/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 311s 13s/step - accuracy: 0.8638 - loss: 0.4325 - mean_iou: 0.6101 - val_accuracy: 0.8297 - val_loss: 0.5420 - val_mean_iou: 0.5635
Epoch 42/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.8705 - loss: 0.4131 - mean_iou: 0.6222 - val_accuracy: 0.8317 - val_loss: 0.5357 - val_mean_iou: 0.5686
Epoch 43/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8664 - loss: 0.4249 - mean_iou: 0.6174 - val_accuracy: 0.8276 - val_loss: 0.5400 - val_mean_iou: 0.5607
Epoch 44/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 325s 14s/step - accuracy: 0.8640 - loss: 0.4295 - mean_iou: 0.6139 - val_accuracy: 0.8294 - val_loss: 0.5452 - val_mean_iou: 0.5608
Epoch 45/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 314s 14s/step - accuracy: 0.8686 - loss: 0.4162 - mean_iou: 0.6197 - val_accuracy: 0.8296 - val_loss: 0.5451 - val_mean_iou: 0.5645
Epoch 46/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 328s 14s/step - accuracy: 0.8682 - loss: 0.4179 - mean_iou: 0.6213 - val_accuracy: 0.8230 - val_loss: 0.5722 - val_mean_iou: 0.5525
Epoch 47/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 312s 14s/step - accuracy: 0.8647 - loss: 0.4274 - mean_iou: 0.6180 - val_accuracy: 0.8248 - val_loss: 0.5608 - val_mean_iou: 0.5563

Evaluation

In [ ]:
plt.figure(figsize=(15,6))

plt.subplot(1,3,1)
plt.plot(model_history.history['val_loss'])
plt.plot(model_history.history['loss'])
plt.title("Fitting history: LOSS")
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper right')

plt.subplot(1,3,2)
plt.plot(model_history.history['val_accuracy'])
plt.plot(model_history.history['accuracy'])
plt.title("Fitting history: ACCURACY")
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')

plt.subplot(1,3,3)
plt.plot(model_history.history['val_mean_iou'])
plt.plot(model_history.history['mean_iou'])
plt.title("Fitting history: MEAN IOU")
plt.ylabel('Mean IoU')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')

plt.show()
No description has been provided for this image

Les courbes mettent en évidence un apprentissage régulier avec une bonne stabilité, avec toutefois un surapprentissage.

In [ ]:
from tensorflow.keras.models import load_model
In [ ]:
# sauvegarde
model.save('model.h5')
In [ ]:
# chargement
model = load_model('model.h5')

Prédictions

In [ ]:
# prédictions
pred_masks = model.predict(test_images)

# convertir les prédictions en masques de classe (chaque pixel prend la valeur de la classe avec la probabilité la plus élevée)
pred_masks = np.argmax(pred_masks, axis=-1)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
In [ ]:
np.unique(pred_masks)
Out[ ]:
array([0, 1, 2, 3, 4, 5, 6, 7], dtype=int64)

Les 8 classes sont bien représentées.

In [ ]:
# visu
num_examples = 3
for i in range(num_examples):
    plt.figure(figsize=(15, 5))

    plt.subplot(1, 3, 1)
    plt.imshow(test_images[i])
    plt.title('Input Image')
    plt.axis('off')

    plt.subplot(1, 3, 2)
    plt.imshow(remapped_test_masks[i])
    plt.title('True Mask')
    plt.axis('off')

    plt.subplot(1, 3, 3)
    plt.imshow(pred_masks[i])
    plt.title('Predicted Mask')
    plt.axis('off')

    plt.show()
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Nous nous intéressons à la métrique IoU (Intersection over Union) : l'intersection représente le nombre de pixels communs entre les prédictions et la vérité-terrain, et l'union représente le chevauchement des deux. $$IoU = \frac{Aire\ intersection}{Aire\ union}$$ Il s'agit d'une métrique pertinente dans la tâche de segmentation sémantique dans le sens où elle nous apporte une observation du nombre de pixels bien classés, en d'autres termes la précision spatiale des masques de prédictions.

In [ ]:
def iou_metric(y_true, y_pred, num_classes):
    ious = []
    for cls in range(num_classes):
        intersection = np.logical_and(y_true == cls, y_pred == cls)
        union = np.logical_or(y_true == cls, y_pred == cls)
        if np.sum(union) == 0:
            iou_score = float('nan')  # évite la division par zéro
        else:
            iou_score = np.sum(intersection) / np.sum(union)
        ious.append(iou_score)
    return ious
In [ ]:
# IoU pour chaque classe
num_classes = 8
ious = [] # ious pour chaque classe pour chaque paire de masques
for true_mask, pred_mask in zip(remapped_test_masks, pred_masks):
    ious.append(iou_metric(true_mask, pred_mask, num_classes))

# IoU moyen
mean_iou_per_class = np.mean(ious, axis=0)
mean_iou = np.mean(mean_iou_per_class)
In [ ]:
mean_iou.round(2)
Out[ ]:
0.52
In [ ]:
mean_iou_per_class.round(2)
Out[ ]:
array([0.66, 0.85, 0.69, 0.11, 0.61, 0.58, 0.16, 0.5 ])
In [ ]:
# affichage
class_names = ['void', 'flat', 'construction', 'object', 'nature', 'sky', 'human', 'vehicle']

fig, ax = plt.subplots(figsize=(10, 6))
bars = ax.bar(class_names, mean_iou_per_class, color='lightblue')
ax.set_title('Mean IoU per class')
ax.set_xlabel('Class')
ax.set_ylabel('Mean IoU')
ax.set_ylim(0, 1)
ax.grid(axis='y', linestyle='--', color='lightgrey')

for bar in bars:
    height = bar.get_height()
    ax.annotate(f'{height:.2f}',
                xy=(bar.get_x() + bar.get_width() / 2, height),
                xytext=(0, 3),
                textcoords="offset points",
                ha='center', va='bottom')

plt.tight_layout()
plt.show()
No description has been provided for this image

On observe que les classes présentant le moins d'items bien classés sont les objets et les humains. Si l'on observe les masques prédits, on voit effectivement que les piétons sont assimilés à des voitures (ce qui n'est pas un résultat à améliorer ++ si l'on souhaite appliquer le modèle à la conduite autonome). Ce résultat s'explique par le fait que les objets et piétons sont les polygones les plus petits sur les images, donc ceux qui présentent le moins de pixels susceptibles d'être bien appris par le modèle. Par ailleurs, si l'on observe les images, on voit que les piétons sont assez faciles à distinguer par l'œil humain qui a une conscience de leur forme, mais en termes de contraste de pixels, ils peuvent être difficiles à différencier et à extraire pour un modèle.

Concernant les pixels des classes les mieux prédites, on voit que ce sont des régions vastes des photos, le modèle apprend donc sur beaucoup plus de pixels et est donc meilleur pour les détecter.

In [ ]:
!pip install scikit-learn
Collecting scikit-learn
  Downloading scikit_learn-1.5.0-cp311-cp311-win_amd64.whl.metadata (11 kB)
Requirement already satisfied: numpy>=1.19.5 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from scikit-learn) (1.26.4)
Requirement already satisfied: scipy>=1.6.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from scikit-learn) (1.12.0)
Collecting joblib>=1.2.0 (from scikit-learn)
  Downloading joblib-1.4.2-py3-none-any.whl.metadata (5.4 kB)
Collecting threadpoolctl>=3.1.0 (from scikit-learn)
  Downloading threadpoolctl-3.5.0-py3-none-any.whl.metadata (13 kB)
Downloading scikit_learn-1.5.0-cp311-cp311-win_amd64.whl (11.0 MB)
   ---------------------------------------- 0.0/11.0 MB ? eta -:--:--
   ---------------------------------------- 0.1/11.0 MB 2.6 MB/s eta 0:00:05
   - -------------------------------------- 0.4/11.0 MB 3.8 MB/s eta 0:00:03
   -- ------------------------------------- 0.6/11.0 MB 4.5 MB/s eta 0:00:03
   --- ------------------------------------ 1.0/11.0 MB 5.3 MB/s eta 0:00:02
   ---- ----------------------------------- 1.3/11.0 MB 5.6 MB/s eta 0:00:02
   ------ --------------------------------- 1.6/11.0 MB 5.8 MB/s eta 0:00:02
   ------- -------------------------------- 2.0/11.0 MB 6.3 MB/s eta 0:00:02
   -------- ------------------------------- 2.4/11.0 MB 6.5 MB/s eta 0:00:02
   --------- ------------------------------ 2.7/11.0 MB 6.6 MB/s eta 0:00:02
   ----------- ---------------------------- 3.1/11.0 MB 6.9 MB/s eta 0:00:02
   ------------ --------------------------- 3.5/11.0 MB 7.0 MB/s eta 0:00:02
   ------------- -------------------------- 3.8/11.0 MB 7.0 MB/s eta 0:00:02
   --------------- ------------------------ 4.2/11.0 MB 7.1 MB/s eta 0:00:01
   ---------------- ----------------------- 4.5/11.0 MB 7.1 MB/s eta 0:00:01
   ----------------- ---------------------- 4.8/11.0 MB 7.0 MB/s eta 0:00:01
   ------------------- -------------------- 5.2/11.0 MB 7.1 MB/s eta 0:00:01
   -------------------- ------------------- 5.5/11.0 MB 7.0 MB/s eta 0:00:01
   --------------------- ------------------ 5.9/11.0 MB 7.1 MB/s eta 0:00:01
   ---------------------- ----------------- 6.3/11.0 MB 7.2 MB/s eta 0:00:01
   ------------------------ --------------- 6.7/11.0 MB 7.3 MB/s eta 0:00:01
   -------------------------- ------------- 7.1/11.0 MB 7.4 MB/s eta 0:00:01
   --------------------------- ------------ 7.6/11.0 MB 7.4 MB/s eta 0:00:01
   ----------------------------- ---------- 8.0/11.0 MB 7.5 MB/s eta 0:00:01
   ------------------------------ --------- 8.4/11.0 MB 7.6 MB/s eta 0:00:01
   ------------------------------- -------- 8.8/11.0 MB 7.7 MB/s eta 0:00:01
   --------------------------------- ------ 9.1/11.0 MB 7.6 MB/s eta 0:00:01
   --------------------------------- ------ 9.3/11.0 MB 7.5 MB/s eta 0:00:01
   --------------------------------- ------ 9.3/11.0 MB 7.5 MB/s eta 0:00:01
   ------------------------------------ --- 9.9/11.0 MB 7.4 MB/s eta 0:00:01
   -------------------------------------- - 10.5/11.0 MB 7.8 MB/s eta 0:00:01
   ---------------------------------------  10.9/11.0 MB 8.0 MB/s eta 0:00:01
   ---------------------------------------  11.0/11.0 MB 7.9 MB/s eta 0:00:01
   ---------------------------------------  11.0/11.0 MB 7.9 MB/s eta 0:00:01
   ---------------------------------------- 11.0/11.0 MB 7.3 MB/s eta 0:00:00
Downloading joblib-1.4.2-py3-none-any.whl (301 kB)
   ---------------------------------------- 0.0/301.8 kB ? eta -:--:--
   ------------------------------- -------- 235.5/301.8 kB 7.3 MB/s eta 0:00:01
   ---------------------------------------- 301.8/301.8 kB 4.7 MB/s eta 0:00:00
Downloading threadpoolctl-3.5.0-py3-none-any.whl (18 kB)
Installing collected packages: threadpoolctl, joblib, scikit-learn
Successfully installed joblib-1.4.2 scikit-learn-1.5.0 threadpoolctl-3.5.0
In [ ]:
from sklearn.metrics import confusion_matrix

# aplatir les masques pour obtenir une liste d'étiquettes de pixels
y_true = remapped_test_masks.flatten()
y_pred = pred_masks.flatten()

conf_matrix = confusion_matrix(y_true, y_pred)

plt.figure(figsize=(10, 8))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()
No description has been provided for this image
In [ ]:
from sklearn.metrics import classification_report

class_report = classification_report(y_true, y_pred)
print("Classification Report:\n")
print(class_report)
Classification Report:

              precision    recall  f1-score   support

           0       0.92      0.54      0.68    197494
           1       0.89      0.95      0.92    739743
           2       0.82      0.86      0.84    511198
           3       0.45      0.13      0.20     53572
           4       0.79      0.89      0.84    273454
           5       0.85      0.86      0.86     46727
           6       0.52      0.37      0.43     24347
           7       0.74      0.85      0.79    103161

    accuracy                           0.84   1949696
   macro avg       0.75      0.68      0.70   1949696
weighted avg       0.84      0.84      0.83   1949696

In [ ]:
precision = [.92, .89, .82, .45, .79, .85, .52, .74]
recall = [.54, .95, .86, .13, .89, .86, .37, .85]
f1_score = [.68, .92, .84, .20, .84, .86, .43, .79]

bar_width = 0.25
r1 = np.arange(len(precision))
r2 = [x + bar_width for x in r1]
r3 = [x + bar_width for x in r2]

plt.figure(figsize=(12, 6))
plt.bar(r1, precision, color='lightblue', width=bar_width, edgecolor='grey', label='Precision')
plt.bar(r2, recall, color='steelblue', width=bar_width, edgecolor='grey', label='Recall')
plt.bar(r3, f1_score, color='darkblue', width=bar_width, edgecolor='grey', label='F1 Score')

plt.xlabel('Class', fontweight='bold')
plt.ylabel('Score', fontweight='bold')
plt.title('Performance Metrics per Class')
plt.xticks([r + bar_width for r in range(len(precision))], class_names)

for i in range(len(precision)):
    plt.text(r1[i], precision[i] + 0.02, f'{precision[i]:.2f}', ha='center', va='bottom')
    plt.text(r2[i], recall[i] + 0.02, f'{recall[i]:.2f}', ha='center', va='bottom')
    plt.text(r3[i], f1_score[i] + 0.02, f'{f1_score[i]:.2f}', ha='center', va='bottom')

plt.ylim(0, 1)
plt.grid(axis='y', linestyle='--', color='lightgrey')
plt.legend()
plt.tight_layout()
plt.show()
No description has been provided for this image

On observe les mêmes tendances concernant les métriques précision, rappel et f1-score que pour l'IoU, à savoir de moindres performances sur les classes objets et humains. Les métriques sont relativement cohérentes à l'intérieur d'une même classe, à part pour 'void' qui présente une bonne précision mais un faible rappel (idem pour 'object'). Par ex. pour les objets, cela signifie que lorsque le modèle prédit un objet alors cela s'avère à 45% juste (prédiction), mais il peine à les détecter (13% de rappel). Comme nous l'avons mentionné, cela s'explique par une sous-représentation de ces classes dans le dataset.

Nous allons utiliser la librairie optuna afin d'optimiser les hyperparamètres de manière automatisée. Puis nous utiliserons l'objet de sortie d'optuna pour réentraîner le modèle avec les meilleurs hyperparamètres. Nous avons choisi de faire varier le nombre filtres pour chaque couche de neurones, ainsi que le learning rate.

Utilisation d'optuna pour améliorer les hyperparametres

In [ ]:
!pip install optuna
!pip install optuna-integration
Collecting optuna
  Downloading optuna-3.6.1-py3-none-any.whl.metadata (17 kB)
Collecting alembic>=1.5.0 (from optuna)
  Downloading alembic-1.13.1-py3-none-any.whl.metadata (7.4 kB)
Collecting colorlog (from optuna)
  Downloading colorlog-6.8.2-py3-none-any.whl.metadata (10 kB)
Requirement already satisfied: numpy in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna) (1.26.4)
Requirement already satisfied: packaging>=20.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna) (23.2)
Collecting sqlalchemy>=1.3.0 (from optuna)
  Downloading SQLAlchemy-2.0.30-cp311-cp311-win_amd64.whl.metadata (9.8 kB)
Requirement already satisfied: tqdm in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna) (4.66.4)
Requirement already satisfied: PyYAML in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna) (6.0.1)
Collecting Mako (from alembic>=1.5.0->optuna)
  Downloading Mako-1.3.5-py3-none-any.whl.metadata (2.9 kB)
Requirement already satisfied: typing-extensions>=4 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from alembic>=1.5.0->optuna) (4.10.0)
Collecting greenlet!=0.4.17 (from sqlalchemy>=1.3.0->optuna)
  Downloading greenlet-3.0.3-cp311-cp311-win_amd64.whl.metadata (3.9 kB)
Requirement already satisfied: colorama in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from colorlog->optuna) (0.4.6)
Requirement already satisfied: MarkupSafe>=0.9.2 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from Mako->alembic>=1.5.0->optuna) (2.1.5)
Downloading optuna-3.6.1-py3-none-any.whl (380 kB)
   ---------------------------------------- 0.0/380.1 kB ? eta -:--:--
   ------- -------------------------------- 71.7/380.1 kB 1.9 MB/s eta 0:00:01
   ------------ --------------------------- 122.9/380.1 kB 1.8 MB/s eta 0:00:01
   -------------------------- ------------- 256.0/380.1 kB 2.0 MB/s eta 0:00:01
   ------------------------------------- -- 358.4/380.1 kB 2.0 MB/s eta 0:00:01
   ---------------------------------------- 380.1/380.1 kB 2.0 MB/s eta 0:00:00
Downloading alembic-1.13.1-py3-none-any.whl (233 kB)
   ---------------------------------------- 0.0/233.4 kB ? eta -:--:--
   --------------- ------------------------ 92.2/233.4 kB 1.7 MB/s eta 0:00:01
   ------------------- -------------------- 112.6/233.4 kB 1.3 MB/s eta 0:00:01
   ---------------------------------------- 233.4/233.4 kB 1.8 MB/s eta 0:00:00
Downloading SQLAlchemy-2.0.30-cp311-cp311-win_amd64.whl (2.1 MB)
   ---------------------------------------- 0.0/2.1 MB ? eta -:--:--
   - -------------------------------------- 0.1/2.1 MB 3.8 MB/s eta 0:00:01
   --- ------------------------------------ 0.2/2.1 MB 2.6 MB/s eta 0:00:01
   ----- ---------------------------------- 0.3/2.1 MB 2.3 MB/s eta 0:00:01
   ------- -------------------------------- 0.4/2.1 MB 2.3 MB/s eta 0:00:01
   --------- ------------------------------ 0.5/2.1 MB 2.3 MB/s eta 0:00:01
   ---------- ----------------------------- 0.6/2.1 MB 2.2 MB/s eta 0:00:01
   ------------ --------------------------- 0.7/2.1 MB 2.2 MB/s eta 0:00:01
   -------------- ------------------------- 0.8/2.1 MB 2.2 MB/s eta 0:00:01
   ---------------- ----------------------- 0.9/2.1 MB 2.2 MB/s eta 0:00:01
   ------------------ --------------------- 1.0/2.1 MB 2.2 MB/s eta 0:00:01
   -------------------- ------------------- 1.1/2.1 MB 2.2 MB/s eta 0:00:01
   ---------------------- ----------------- 1.2/2.1 MB 2.2 MB/s eta 0:00:01
   ------------------------ --------------- 1.3/2.1 MB 2.2 MB/s eta 0:00:01
   -------------------------- ------------- 1.4/2.1 MB 2.2 MB/s eta 0:00:01
   ---------------------------- ----------- 1.5/2.1 MB 2.2 MB/s eta 0:00:01
   ------------------------------ --------- 1.6/2.1 MB 2.2 MB/s eta 0:00:01
   ------------------------------- -------- 1.7/2.1 MB 2.2 MB/s eta 0:00:01
   --------------------------------- ------ 1.8/2.1 MB 2.2 MB/s eta 0:00:01
   ----------------------------------- ---- 1.9/2.1 MB 2.2 MB/s eta 0:00:01
   ------------------------------------- -- 2.0/2.1 MB 2.2 MB/s eta 0:00:01
   ---------------------------------------  2.1/2.1 MB 2.2 MB/s eta 0:00:01
   ---------------------------------------- 2.1/2.1 MB 2.1 MB/s eta 0:00:00
Downloading colorlog-6.8.2-py3-none-any.whl (11 kB)
Downloading greenlet-3.0.3-cp311-cp311-win_amd64.whl (292 kB)
   ---------------------------------------- 0.0/292.8 kB ? eta -:--:--
   --------- ------------------------------ 71.7/292.8 kB 2.0 MB/s eta 0:00:01
   ----------------------- ---------------- 174.1/292.8 kB 2.1 MB/s eta 0:00:01
   ------------------------------------- -- 276.5/292.8 kB 2.1 MB/s eta 0:00:01
   ---------------------------------------- 292.8/292.8 kB 2.0 MB/s eta 0:00:00
Downloading Mako-1.3.5-py3-none-any.whl (78 kB)
   ---------------------------------------- 0.0/78.6 kB ? eta -:--:--
   ------------------------------------ --- 71.7/78.6 kB 1.9 MB/s eta 0:00:01
   ---------------------------------------- 78.6/78.6 kB 1.1 MB/s eta 0:00:00
Installing collected packages: Mako, greenlet, colorlog, sqlalchemy, alembic, optuna
Successfully installed Mako-1.3.5 alembic-1.13.1 colorlog-6.8.2 greenlet-3.0.3 optuna-3.6.1 sqlalchemy-2.0.30
Collecting optuna-integration
  Downloading optuna_integration-3.6.0-py3-none-any.whl.metadata (10 kB)
Requirement already satisfied: optuna in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna-integration) (3.6.1)
Requirement already satisfied: alembic>=1.5.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (1.13.1)
Requirement already satisfied: colorlog in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (6.8.2)
Requirement already satisfied: numpy in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (1.26.4)
Requirement already satisfied: packaging>=20.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (23.2)
Requirement already satisfied: sqlalchemy>=1.3.0 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (2.0.30)
Requirement already satisfied: tqdm in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (4.66.4)
Requirement already satisfied: PyYAML in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from optuna->optuna-integration) (6.0.1)
Requirement already satisfied: Mako in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from alembic>=1.5.0->optuna->optuna-integration) (1.3.5)
Requirement already satisfied: typing-extensions>=4 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from alembic>=1.5.0->optuna->optuna-integration) (4.10.0)
Requirement already satisfied: greenlet!=0.4.17 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from sqlalchemy>=1.3.0->optuna->optuna-integration) (3.0.3)
Requirement already satisfied: colorama in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from colorlog->optuna->optuna-integration) (0.4.6)
Requirement already satisfied: MarkupSafe>=0.9.2 in c:\users\engasser ophélie\desktop\cv\venv\lib\site-packages (from Mako->alembic>=1.5.0->optuna->optuna-integration) (2.1.5)
Downloading optuna_integration-3.6.0-py3-none-any.whl (93 kB)
   ---------------------------------------- 0.0/93.4 kB ? eta -:--:--
   -------- ------------------------------- 20.5/93.4 kB 640.0 kB/s eta 0:00:01
   ----------------------------------- ---- 81.9/93.4 kB 1.1 MB/s eta 0:00:01
   ---------------------------------------- 93.4/93.4 kB 1.1 MB/s eta 0:00:00
Installing collected packages: optuna-integration
Successfully installed optuna-integration-3.6.0
In [ ]:
import optuna
from optuna.integration import TFKerasPruningCallback
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
from tensorflow.keras.models import Model

def unet_model(trial):
    inputs = Input(shape=(128, 128, 3))
    
    # hyperparameters to tune MODIFY HERE to add or remove hyperparameters optuna will search for the best value
    n_filters = trial.suggest_categorical('n_filters', [32, 64])
    learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1e-3)
    
    # encoder
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    # bridge
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(conv3)

    # decoder
    up4 = Conv2DTranspose(n_filters * 2, (2, 2), strides=(2, 2), padding='same')(conv3)
    up4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(up4)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv4)

    up5 = Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv4)
    up5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(up5)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv5)

    # output layer
    outputs = Conv2D(8, 1, activation='softmax')(conv5)

    model = Model(inputs=[inputs], outputs=[outputs])

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                  metrics=['accuracy', UpdatedMeanIoU(num_classes=8)])
    return model

def objective(trial):
    model = unet_model(trial)
    
    TRAIN_LENGTH = len(train_images)
    BATCH_SIZE = 64
    BUFFER_SIZE = 1000
    STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE
    EPOCHS = 100
    VAL_SUBSPLITS = 5
    VALIDATION_STEPS = len(val_images) // BATCH_SIZE // VAL_SUBSPLITS

    # préparation des datasets
    train_dataset = tf.data.Dataset.from_tensor_slices((train_images, remapped_train_masks))
    train_dataset = train_dataset.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()

    val_dataset = tf.data.Dataset.from_tensor_slices((val_images, remapped_val_masks))
    val_dataset = val_dataset.batch(BATCH_SIZE)

    history = model.fit(train_dataset,
                        epochs=EPOCHS,
                        steps_per_epoch=STEPS_PER_EPOCH,
                        validation_data=val_dataset,
                        validation_steps=VALIDATION_STEPS,
                        callbacks=[TFKerasPruningCallback(trial, 'val_loss')],
                        verbose=0)
    
    return min(history.history['val_loss'])

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=1) # MODIFY HERE n_trials to increase the number of trials the optuna will run to find the best hyperparameters
[I 2024-06-05 09:13:46,221] A new study created in memory with name: no-name-976d0bc8-da94-4151-8588-4a8dc8d08c64
C:\Users\Engasser Ophélie\AppData\Local\Temp\ipykernel_39152\3779512948.py:12: FutureWarning: suggest_loguniform has been deprecated in v3.0.0. This feature will be removed in v6.0.0. See https://github.com/optuna/optuna/releases/tag/v3.0.0. Use suggest_float(..., log=True) instead.
  learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1e-3)
[I 2024-06-05 19:59:41,910] Trial 0 finished with value: 0.5540214776992798 and parameters: {'n_filters': 64, 'learning_rate': 0.00027647990450630345}. Best is trial 0 with value: 0.5540214776992798.

A présent il est possible d'entraîner un nouveau modèle avec les meilleurs hyperparamètres.

In [ ]:
from tensorflow.keras.callbacks import EarlyStopping

best_params = study.best_params
best_params

def unet_model(best_params):
    inputs = Input(shape=(128, 128, 3))
    
    # hyperparameters to tune MODIFY HERE to add or remove hyperparameters optuna will search for the best value
    n_filters = best_params['n_filters']
    
    # encoder
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    # bridge
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(conv3)

    # decoder
    up4 = Conv2DTranspose(n_filters * 2, (2, 2), strides=(2, 2), padding='same')(conv3)
    up4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(up4)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv4)

    up5 = Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv4)
    up5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(up5)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv5)

    # output layer
    outputs = Conv2D(8, 1, activation='softmax')(conv5)

    model = Model(inputs=[inputs], outputs=[outputs])

    return model

model = unet_model(best_params)

TRAIN_LENGTH = len(train_images)
BATCH_SIZE = 64
BUFFER_SIZE = 1000
STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE
EPOCHS = 100
model.compile(optimizer=tf.keras.optimizers.Adam(best_params['learning_rate']),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy', UpdatedMeanIoU(num_classes=8, name = "mean_iou")])

model_history = model.fit(train_dataset, epochs=EPOCHS,
                          steps_per_epoch=STEPS_PER_EPOCH,
                          validation_steps=VALIDATION_STEPS,
                          validation_data=val_dataset,
                          callbacks=[EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)])
Epoch 1/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 362s 14s/step - accuracy: 0.3388 - loss: 1.9724 - mean_iou: 0.0575 - val_accuracy: 0.3924 - val_loss: 1.7324 - val_mean_iou: 0.0491
Epoch 2/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 15s/step - accuracy: 0.3887 - loss: 1.6892 - mean_iou: 0.0501 - val_accuracy: 0.4852 - val_loss: 1.4667 - val_mean_iou: 0.1056
Epoch 3/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 339s 15s/step - accuracy: 0.4986 - loss: 1.4482 - mean_iou: 0.1186 - val_accuracy: 0.5406 - val_loss: 1.3217 - val_mean_iou: 0.1659
Epoch 4/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 342s 15s/step - accuracy: 0.5664 - loss: 1.2370 - mean_iou: 0.1912 - val_accuracy: 0.5960 - val_loss: 1.1735 - val_mean_iou: 0.2541
Epoch 5/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 335s 14s/step - accuracy: 0.6435 - loss: 1.0643 - mean_iou: 0.3036 - val_accuracy: 0.6325 - val_loss: 1.0842 - val_mean_iou: 0.2890
Epoch 6/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 350s 15s/step - accuracy: 0.6602 - loss: 1.0020 - mean_iou: 0.3230 - val_accuracy: 0.6475 - val_loss: 1.0339 - val_mean_iou: 0.3062
Epoch 7/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 324s 14s/step - accuracy: 0.6835 - loss: 0.9364 - mean_iou: 0.3420 - val_accuracy: 0.6759 - val_loss: 0.9685 - val_mean_iou: 0.3213
Epoch 8/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 327s 14s/step - accuracy: 0.6964 - loss: 0.9046 - mean_iou: 0.3607 - val_accuracy: 0.6999 - val_loss: 0.9186 - val_mean_iou: 0.3660
Epoch 9/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 329s 14s/step - accuracy: 0.7222 - loss: 0.8489 - mean_iou: 0.3964 - val_accuracy: 0.7108 - val_loss: 0.9014 - val_mean_iou: 0.3811
Epoch 10/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 342s 15s/step - accuracy: 0.7193 - loss: 0.8585 - mean_iou: 0.3983 - val_accuracy: 0.7205 - val_loss: 0.8665 - val_mean_iou: 0.3910
Epoch 11/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 339s 15s/step - accuracy: 0.7391 - loss: 0.8051 - mean_iou: 0.4184 - val_accuracy: 0.7089 - val_loss: 0.8741 - val_mean_iou: 0.3875
Epoch 12/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 327s 14s/step - accuracy: 0.7466 - loss: 0.7810 - mean_iou: 0.4311 - val_accuracy: 0.7326 - val_loss: 0.8385 - val_mean_iou: 0.4087
Epoch 13/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 333s 14s/step - accuracy: 0.7481 - loss: 0.7813 - mean_iou: 0.4298 - val_accuracy: 0.7220 - val_loss: 0.8540 - val_mean_iou: 0.3987
Epoch 14/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 355s 15s/step - accuracy: 0.7607 - loss: 0.7500 - mean_iou: 0.4412 - val_accuracy: 0.7448 - val_loss: 0.8016 - val_mean_iou: 0.4163
Epoch 15/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 339s 15s/step - accuracy: 0.7500 - loss: 0.7748 - mean_iou: 0.4304 - val_accuracy: 0.7326 - val_loss: 0.8210 - val_mean_iou: 0.4054
Epoch 16/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 355s 16s/step - accuracy: 0.7534 - loss: 0.7684 - mean_iou: 0.4355 - val_accuracy: 0.7289 - val_loss: 0.8128 - val_mean_iou: 0.4042
Epoch 17/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 356s 15s/step - accuracy: 0.7718 - loss: 0.7130 - mean_iou: 0.4529 - val_accuracy: 0.7346 - val_loss: 0.8013 - val_mean_iou: 0.4193
Epoch 18/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 343s 15s/step - accuracy: 0.7792 - loss: 0.6963 - mean_iou: 0.4628 - val_accuracy: 0.7499 - val_loss: 0.7659 - val_mean_iou: 0.4290
Epoch 19/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 351s 15s/step - accuracy: 0.7783 - loss: 0.6984 - mean_iou: 0.4634 - val_accuracy: 0.7221 - val_loss: 0.8300 - val_mean_iou: 0.4045
Epoch 20/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 355s 15s/step - accuracy: 0.7752 - loss: 0.7015 - mean_iou: 0.4611 - val_accuracy: 0.7413 - val_loss: 0.7862 - val_mean_iou: 0.4231
Epoch 21/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 352s 15s/step - accuracy: 0.7795 - loss: 0.6951 - mean_iou: 0.4642 - val_accuracy: 0.7559 - val_loss: 0.7511 - val_mean_iou: 0.4286
Epoch 22/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 373s 16s/step - accuracy: 0.7850 - loss: 0.6788 - mean_iou: 0.4708 - val_accuracy: 0.7508 - val_loss: 0.7553 - val_mean_iou: 0.4342
Epoch 23/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 343s 15s/step - accuracy: 0.7920 - loss: 0.6575 - mean_iou: 0.4767 - val_accuracy: 0.7602 - val_loss: 0.7336 - val_mean_iou: 0.4378
Epoch 24/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 357s 15s/step - accuracy: 0.7940 - loss: 0.6539 - mean_iou: 0.4811 - val_accuracy: 0.7483 - val_loss: 0.7583 - val_mean_iou: 0.4276
Epoch 25/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 359s 16s/step - accuracy: 0.7930 - loss: 0.6528 - mean_iou: 0.4799 - val_accuracy: 0.7705 - val_loss: 0.7028 - val_mean_iou: 0.4537
Epoch 26/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 343s 15s/step - accuracy: 0.8005 - loss: 0.6400 - mean_iou: 0.4905 - val_accuracy: 0.7618 - val_loss: 0.7243 - val_mean_iou: 0.4430
Epoch 27/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 347s 15s/step - accuracy: 0.8011 - loss: 0.6310 - mean_iou: 0.4915 - val_accuracy: 0.7751 - val_loss: 0.6858 - val_mean_iou: 0.4591
Epoch 28/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 356s 16s/step - accuracy: 0.8048 - loss: 0.6240 - mean_iou: 0.4930 - val_accuracy: 0.7786 - val_loss: 0.6804 - val_mean_iou: 0.4624
Epoch 29/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 338s 15s/step - accuracy: 0.8083 - loss: 0.6147 - mean_iou: 0.4982 - val_accuracy: 0.7770 - val_loss: 0.6779 - val_mean_iou: 0.4597
Epoch 30/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 337s 15s/step - accuracy: 0.8122 - loss: 0.6012 - mean_iou: 0.5008 - val_accuracy: 0.7726 - val_loss: 0.6905 - val_mean_iou: 0.4536
Epoch 31/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 365s 16s/step - accuracy: 0.8113 - loss: 0.6038 - mean_iou: 0.5004 - val_accuracy: 0.7718 - val_loss: 0.7030 - val_mean_iou: 0.4621
Epoch 32/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 355s 15s/step - accuracy: 0.8113 - loss: 0.6055 - mean_iou: 0.5036 - val_accuracy: 0.7736 - val_loss: 0.6902 - val_mean_iou: 0.4612
Epoch 33/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 357s 15s/step - accuracy: 0.8155 - loss: 0.5889 - mean_iou: 0.5081 - val_accuracy: 0.7758 - val_loss: 0.6816 - val_mean_iou: 0.4614
Epoch 34/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 360s 15s/step - accuracy: 0.8171 - loss: 0.5901 - mean_iou: 0.5114 - val_accuracy: 0.7802 - val_loss: 0.6742 - val_mean_iou: 0.4696
Epoch 35/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 357s 16s/step - accuracy: 0.8098 - loss: 0.6073 - mean_iou: 0.5021 - val_accuracy: 0.7874 - val_loss: 0.6605 - val_mean_iou: 0.4715
Epoch 36/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 351s 15s/step - accuracy: 0.8174 - loss: 0.5874 - mean_iou: 0.5097 - val_accuracy: 0.7887 - val_loss: 0.6482 - val_mean_iou: 0.4813
Epoch 37/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 338s 15s/step - accuracy: 0.8195 - loss: 0.5749 - mean_iou: 0.5147 - val_accuracy: 0.7812 - val_loss: 0.6739 - val_mean_iou: 0.4724
Epoch 38/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 343s 15s/step - accuracy: 0.8201 - loss: 0.5788 - mean_iou: 0.5149 - val_accuracy: 0.7906 - val_loss: 0.6457 - val_mean_iou: 0.4850
Epoch 39/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 351s 15s/step - accuracy: 0.8239 - loss: 0.5652 - mean_iou: 0.5212 - val_accuracy: 0.7909 - val_loss: 0.6424 - val_mean_iou: 0.4818
Epoch 40/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 342s 15s/step - accuracy: 0.8214 - loss: 0.5741 - mean_iou: 0.5189 - val_accuracy: 0.7888 - val_loss: 0.6486 - val_mean_iou: 0.4869
Epoch 41/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 340s 15s/step - accuracy: 0.8270 - loss: 0.5577 - mean_iou: 0.5267 - val_accuracy: 0.7850 - val_loss: 0.6633 - val_mean_iou: 0.4848
Epoch 42/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 345s 15s/step - accuracy: 0.8174 - loss: 0.5833 - mean_iou: 0.5161 - val_accuracy: 0.7982 - val_loss: 0.6238 - val_mean_iou: 0.4946
Epoch 43/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 349s 15s/step - accuracy: 0.8212 - loss: 0.5704 - mean_iou: 0.5222 - val_accuracy: 0.7900 - val_loss: 0.6502 - val_mean_iou: 0.4929
Epoch 44/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 354s 15s/step - accuracy: 0.8273 - loss: 0.5580 - mean_iou: 0.5272 - val_accuracy: 0.7796 - val_loss: 0.6762 - val_mean_iou: 0.4747
Epoch 45/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 349s 15s/step - accuracy: 0.8280 - loss: 0.5540 - mean_iou: 0.5324 - val_accuracy: 0.7908 - val_loss: 0.6435 - val_mean_iou: 0.4902
Epoch 46/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 347s 15s/step - accuracy: 0.8328 - loss: 0.5390 - mean_iou: 0.5388 - val_accuracy: 0.7734 - val_loss: 0.6956 - val_mean_iou: 0.4763
Epoch 47/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 351s 15s/step - accuracy: 0.8256 - loss: 0.5562 - mean_iou: 0.5346 - val_accuracy: 0.8012 - val_loss: 0.6147 - val_mean_iou: 0.5102
Epoch 48/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 344s 15s/step - accuracy: 0.8314 - loss: 0.5391 - mean_iou: 0.5441 - val_accuracy: 0.7951 - val_loss: 0.6350 - val_mean_iou: 0.4911
Epoch 49/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 351s 15s/step - accuracy: 0.8224 - loss: 0.5681 - mean_iou: 0.5249 - val_accuracy: 0.7976 - val_loss: 0.6236 - val_mean_iou: 0.5016
Epoch 50/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 344s 15s/step - accuracy: 0.8287 - loss: 0.5512 - mean_iou: 0.5394 - val_accuracy: 0.7973 - val_loss: 0.6266 - val_mean_iou: 0.4971
Epoch 51/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 345s 15s/step - accuracy: 0.8355 - loss: 0.5280 - mean_iou: 0.5486 - val_accuracy: 0.8095 - val_loss: 0.5902 - val_mean_iou: 0.5195
Epoch 52/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 350s 15s/step - accuracy: 0.8369 - loss: 0.5243 - mean_iou: 0.5474 - val_accuracy: 0.8045 - val_loss: 0.6042 - val_mean_iou: 0.5155
Epoch 53/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 349s 15s/step - accuracy: 0.8379 - loss: 0.5189 - mean_iou: 0.5603 - val_accuracy: 0.8051 - val_loss: 0.6045 - val_mean_iou: 0.5156
Epoch 54/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 341s 15s/step - accuracy: 0.8365 - loss: 0.5259 - mean_iou: 0.5503 - val_accuracy: 0.7937 - val_loss: 0.6391 - val_mean_iou: 0.5020
Epoch 55/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 346s 15s/step - accuracy: 0.8365 - loss: 0.5239 - mean_iou: 0.5589 - val_accuracy: 0.7990 - val_loss: 0.6260 - val_mean_iou: 0.5087
Epoch 56/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 340s 15s/step - accuracy: 0.8426 - loss: 0.5113 - mean_iou: 0.5591 - val_accuracy: 0.7999 - val_loss: 0.6169 - val_mean_iou: 0.5123
In [ ]:
plt.figure(figsize=(15,6))

plt.subplot(1,3,1)
plt.plot(model_history.history['val_loss'])
plt.plot(model_history.history['loss'])
plt.title("Fitting history: LOSS")
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper right')

plt.subplot(1,3,2)
plt.plot(model_history.history['val_accuracy'])
plt.plot(model_history.history['accuracy'])
plt.title("Fitting history: ACCURACY")
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')

plt.subplot(1,3,3)
plt.plot(model_history.history['val_mean_iou'])
plt.plot(model_history.history['mean_iou'])
plt.title("Fitting history: MEAN IOU")
plt.ylabel('Mean IoU')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')

plt.show()
No description has been provided for this image
In [ ]:
# sauvegarde
model.save('model_optim.h5')
In [ ]:
# chargement
model_optim = load_model('model_optim.h5')
In [ ]:
# prédictions
pred_masks_optim = model_optim.predict(test_images)

# convertir les prédictions en masques de classe (chaque pixel prend la valeur de la classe avec la probabilité la plus élevée)
pred_masks_optim = np.argmax(pred_masks_optim, axis=-1)
4/4 ━━━━━━━━━━━━━━━━━━━━ 9s 2s/step
In [ ]:
np.unique(pred_masks_optim)
Out[ ]:
array([0, 1, 2, 3, 4, 5, 6, 7], dtype=int64)
In [ ]:
# visu
num_examples = 3
for i in range(num_examples):
    plt.figure(figsize=(15, 5))

    plt.subplot(1, 3, 1)
    plt.imshow(test_images[i])
    plt.title('Input Image')
    plt.axis('off')

    plt.subplot(1, 3, 2)
    plt.imshow(remapped_test_masks[i])
    plt.title('True Mask')
    plt.axis('off')

    plt.subplot(1, 3, 3)
    plt.imshow(pred_masks_optim[i])
    plt.title('Predicted Mask')
    plt.axis('off')

    plt.show()
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
In [ ]:
# IoU pour chaque classe
num_classes = 8
ious = [] # ious pour chaque classe pour chaque paire de masques
for true_mask, pred_mask in zip(remapped_test_masks, pred_masks_optim):
    ious.append(iou_metric(true_mask, pred_masks_optim, num_classes))

# IoU moyen
mean_iou_per_class = np.mean(ious, axis=0)
mean_iou = np.mean(mean_iou_per_class)
In [ ]:
mean_iou.round(2)
Out[ ]:
0.24
In [ ]:
mean_iou_per_class.round(2)
Out[ ]:
array([0.56, 0.69, 0.32, 0.01, 0.16, 0.09, 0.01, 0.08])
In [ ]:
plt.figure(figsize=(10, 6))
bars = plt.bar(class_names, mean_iou_per_class, color='lightblue', edgecolor='grey')

for bar, value in zip(bars, mean_iou_per_class):
    plt.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + 0.01, f'{value:.2f}', ha='center', va='bottom')

plt.title('Mean IoU per Class')
plt.xlabel('Class')
plt.ylabel('Mean IoU')
plt.ylim(0, 1)  # Limiter l'axe des y de 0 à 1
plt.grid(axis='y', linestyle='--', color='lightgrey')
plt.tight_layout()
plt.show()
No description has been provided for this image
In [ ]:
# aplatir les masques pour obtenir une liste d'étiquettes de pixels
y_true = remapped_test_masks.flatten()
y_pred = pred_masks_optim.flatten()

conf_matrix = confusion_matrix(y_true, y_pred)

plt.figure(figsize=(10, 8))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()
No description has been provided for this image
In [ ]:
class_report = classification_report(y_true, y_pred)
print("Classification Report:\n")
print(class_report)
Classification Report:

              precision    recall  f1-score   support

           0       0.93      0.49      0.64    197494
           1       0.86      0.96      0.91    739743
           2       0.81      0.82      0.82    511198
           3       0.56      0.04      0.08     53572
           4       0.74      0.89      0.81    273454
           5       0.83      0.85      0.84     46727
           6       0.58      0.12      0.19     24347
           7       0.69      0.82      0.75    103161

    accuracy                           0.82   1949696
   macro avg       0.75      0.62      0.63   1949696
weighted avg       0.82      0.82      0.80   1949696

Discussion

Nous pouvons aller plus loin, en optimisant le modèle et cela en appliquant plusieurs méthodes : Hyperopt, Optuna...

L'optimisation avec Optuna n'a pas permis d'améliorer les performances du modèle, elles sont même légèrement inférieures à celles obtenues avec le modèle de base. Il est à noter que nous n'avons fait varier, pour des raisons de manque de ressources en calcul, que les paramètres n_filters et learning_rate (et même avec cela, la pipeline a tourné 8 heures). Un axe d'amélioration serait d'augmenter la pipeline avec plus d'hyperparamètres à tester, et ce sur des plages plus larges.

Par ailleurs, nous avons observé au cours de ce travail que l'augmentation progressive du nombre de villes dans les données d'entraînement a permis d'améliorer considérablement le modèle ainsi que de le rendre plus stable. Un autre axe d'amélioration serait d'augmenter encore le nombre de données d'entrée, voire, si cela était possible, d'utiliser l'entièreté des données.

Une autre remarque concerne enfin les limites d'un modèle de segmentation sémantique d'images pour un cas d'usage de conduite autonome. En effet, nous avons ici utilisé 8 catégories d'objets potentiels, le dataset en contient 34. Mais en réalité, une scène visuelle en conduite automobile admet un nombre beaucoup plus important de classes.

Pour aller plus loin, nous avons conçu une application Flask permettant à un utilisateur de charger une image ou une vidéo, et d'appliquer notre modèle en temps réel (cf. fichiers dédiés).

Hover Effect

MERCI MADAME

ROMY APPLICATION

In [ ]:
import os
from IPython.display import display, Image

# define the folder containing images
image_folder = 'images'

# get a list of all image files in the folder
image_files = [f for f in os.listdir(image_folder) if f.endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp', '.tiff'))]

# display each image
for image_file in image_files:
    image_path = os.path.join(image_folder, image_file)
    display(Image(filename=image_path))
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image